Welcome to the documentation of the i6-core recipes!

This is the documentation of the public recipe collection of the RWTH i6 lab for the Sisyphus workflow manager.

The repository is still under construction, so please expect ongoing updates to the source code and the documentation.

API

i6_core.adaptation.ivector

i6_core.adaptation.linear_adaptation_layer

i6_core.adaptation.ubm

i6_core.am.config

i6_core.am.score_features

i6_core.audio.encoding

class i6_core.audio.encoding.BlissChangeEncodingJob(*args, **kwargs)

Uses ffmpeg to convert all audio files of a bliss corpus (file format, encoding, channel layout)

For all parameters, “None” means using the ffmpeg defaults, which depend on the input file and the specified output format.

Parameters:
  • corpus_file – bliss corpus

  • output_format – output file ending to determine container format (without dot)

  • sample_rate – target sample rate of the audio

  • codec – specify the codec, codecs are listed with ffmpeg -codecs

  • codec_options – specify additional codec specific options (be aware of potential conflicts with “fixed bitrate” and “sample_rate”)

  • fixed_bitrate – a target bitrate (be aware that not all codecs support all bitrates)

  • force_num_channels – specify the channel number, exceeding channels will be merged

  • select_channels – tuple of (channel_layout, channel_name), see ffmpeg -layouts

  • ffmpeg_binary – path to a ffmpeg binary, uses system “ffmpeg” if None

  • hash_binary – In some cases it might be required to work with a specific ffmpeg version, in which case the binary needs to be hashed

  • recover_duration – This will open all files with “soundfile” and extract the length information. This is useful if the new encoding might have an effect on the duration, or if no duration was specified in the source corpus. There might be minimal differences when converting the encoding, so only set this to False if you’re willing to accept this risk. None (default) means that the duration is recovered if either output_format or codec is specified, as this might possibly lead to duration mismatches.

  • in_codec – specify the codec of the input file

  • in_codec_options – specify additional codec specific options for the in_codec
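
For example, a minimal usage sketch in a Sisyphus config could look like this (the out_corpus attribute name and the input variable are assumptions; check the job definition for the exact outputs):

from i6_core.audio.encoding import BlissChangeEncodingJob

# convert a bliss corpus to 16 kHz mono wav files (hypothetical input variable: bliss_corpus)
encoding_job = BlissChangeEncodingJob(
    corpus_file=bliss_corpus,
    output_format="wav",
    sample_rate=16000,
    force_num_channels=1,
)
converted_corpus = encoding_job.out_corpus  # assumed output attribute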

i6_core.audio.ffmpeg

class i6_core.audio.ffmpeg.BlissFfmpegJob(*args, **kwargs)

Applies an FFMPEG audio filter to all recordings of a bliss corpus. This Job is extremely generic, as any valid audio option/filter string will work. Please consider using more specific jobs that use this Job as a superclass, see e.g. BlissChangeEncodingJob.

WARNING:
  • This job assumes that file names of individual recordings are unique across the whole corpus.

  • Do not change the duration of the audio files when you have multiple segments per audio, as the segment information will be incorrect afterwards.

Typical applications:

Changing Audio Format/Encoding

  • specify in output_format what container you want to use. If the filter string is empty (“”), ffmpeg will automatically use a default encoding option

  • specify specific encoding with -c:a <codec>. For a list of available codecs and their options see https://ffmpeg.org/ffmpeg-codecs.html#Audio-Encoders

  • specify a fixed bitrate with -b:a <bit_rate>, e.g. 64k. Variable bitrate options depend on the used encoder, refer to the online documentation in this case

  • specify a sample rate with -ar <sample_rate>. FFMPEG will do proper resampling, so the speed of the audio is NOT changed.

Changing Channel Layout

  • for detailed informations see https://trac.ffmpeg.org/wiki/AudioChannelManipulation

  • convert to mono -ac 1

  • selecting a specific audio channel: -filter_complex [0:a]channelsplit=channel_layout=stereo:channels=FR[right] -map [right] For a list of channels/layouts use ffmpeg -layouts

Simple Filter Syntax

For a list of available filters see: https://ffmpeg.org/ffmpeg-filters.html

-af <filter_name>=<first_param>=<first_param_value>:<second_param>=<second_param_value>

Complex Filter Syntax

-filter_complex [<input>]<simple_syntax>[<output>];[<input>]<simple_syntax>[<output>];…

Inputs and outputs can be named arbitrarily, but the default stream 0 audio can be accessed with [0:a]

The output stream that should be written into the audio is defined with -map [<output_stream>]

IMPORTANT! Do not forget to add and escape additional quotation marks correctly for parameters passed to -af or -filter_complex

Parameters:
  • corpus_file – bliss corpus

  • ffmpeg_options – list of additional ffmpeg parameters

  • recover_duration – if the filter changes the duration of the audio, set to True

  • output_format – output file ending to determine container format (without dot)

  • ffmpeg_binary – path to a ffmpeg binary, uses system “ffmpeg” if None

  • hash_binary – In some cases it might be required to work with a specific ffmpeg version, in which case the binary needs to be hashed

  • ffmpeg_input_options – list of ffmpeg parameters that are applied for reading the input files
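
As a sketch, the channel-selection example from above could be wrapped into a job like this (the exact quoting of the filter string and the out_corpus attribute name are assumptions):

from i6_core.audio.ffmpeg import BlissFfmpegJob

# keep only the right stereo channel of every recording (hypothetical input variable: bliss_corpus)
channel_job = BlissFfmpegJob(
    corpus_file=bliss_corpus,
    ffmpeg_options=[
        "-filter_complex",
        "'[0:a]channelsplit=channel_layout=stereo:channels=FR[right]'",
        "-map",
        "'[right]'",
    ],
    recover_duration=False,  # channel selection does not change the duration
    output_format="wav",
)
right_channel_corpus = channel_job.out_corpus  # assumed output attribute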

classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
run_recover_duration()

Open all files with “soundfile” and extract the length information

Returns:

tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.bpe.apply

This is an old location of the BPE jobs, kept for backwards compatibility. For new setups using the subword-nmt based BPE please use i6_core.label.bpe; for other setups please switch to the sentencepiece implementation.

class i6_core.bpe.apply.ApplyBPEModelToLexiconJob(*args, **kwargs)

Apply BPE codes to a Bliss lexicon file

Parameters:
  • bliss_lexicon (Path) –

  • bpe_codes (Path) –

  • bpe_vocab (Path|None) –

  • subword_nmt_repo (Optional[Path]) –

class i6_core.bpe.apply.ApplyBPEToTextJob(*args, **kwargs)

Apply BPE codes on a text file

Parameters:
  • text_file – words text file to convert to bpe

  • bpe_codes – bpe codes file, e.g. ReturnnTrainBpeJob.out_bpe_codes

  • bpe_vocab – if provided, then merge operations that produce OOV are reverted, use e.g. ReturnnTrainBpeJob.out_bpe_dummy_count_vocab

  • subword_nmt_repo – subword nmt repository path. see also CloneGitRepositoryJob

  • gzip_output – use gzip on the output text

  • mini_task – if the Job should run locally, e.g. only a small (<1M lines) text should be processed
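
A usage sketch, assuming codes and vocab come from a ReturnnTrainBpeJob (see i6_core.bpe.train below) and that the output is exposed as out_bpe_text:

from i6_core.bpe.apply import ApplyBPEToTextJob

apply_bpe_job = ApplyBPEToTextJob(
    text_file=train_text,  # hypothetical tk.Path to a (gzipped) word-level text file
    bpe_codes=train_bpe_job.out_bpe_codes,
    bpe_vocab=train_bpe_job.out_bpe_dummy_count_vocab,
    subword_nmt_repo=subword_nmt_repo,  # e.g. from a CloneGitRepositoryJob
    mini_task=True,
)
bpe_text = apply_bpe_job.out_bpe_text  # assumed output attribute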

i6_core.bpe.train

This is an old location of the BPE jobs, kept for backwards compatibility. For new setups using the subword-nmt based BPE please use i6_core.label.bpe; for other setups please switch to the sentencepiece implementation.

class i6_core.bpe.train.ReturnnTrainBpeJob(*args, **kwargs)

Create Bpe codes and vocab files compatible with RETURNN BytePairEncoding Repository:

Parameters:
  • text_file – corpus text file, .gz compressed or uncompressed

  • bpe_size (int) – number of BPE merge operations

  • unk_label (str) – unknown label

  • subword_nmt_repo (Path|None) – subword nmt repository path. see also CloneGitRepositoryJob
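
A sketch of training BPE codes; the out_bpe_codes and out_bpe_dummy_count_vocab attribute names are taken from the parameter descriptions of ApplyBPEToTextJob above:

from i6_core.bpe.train import ReturnnTrainBpeJob

train_bpe_job = ReturnnTrainBpeJob(
    text_file=train_text,  # hypothetical tk.Path to the (gzipped) training text
    bpe_size=10000,
    unk_label="<unk>",
    subword_nmt_repo=subword_nmt_repo,
)
bpe_codes = train_bpe_job.out_bpe_codes
bpe_vocab = train_bpe_job.out_bpe_dummy_count_vocab  # used to revert OOV-producing merges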

class i6_core.bpe.train.TrainBPEModelJob(*args, **kwargs)

Create a bpe codes file using the official subword-nmt repo, either installed from pip or https://github.com/rsennrich/subword-nmt

Parameters:
  • text_corpus (Path) –

  • symbols (int) –

  • min_frequency (int) –

  • dict_input (bool) –

  • total_symbols (bool) –

  • subword_nmt_repo (Optional[Path]) –

i6_core.cart.estimate

class i6_core.cart.estimate.AccumulateCartStatisticsJob(*args, **kwargs)

Goes over all training data and for each triphone state accumulates the values and squared values of the given feature flow

Parameters:
accumulate(task_id)
cleanup_before_run(cmd, retry, task_id, *args)
classmethod create_accumulate_config(crp, alignment_flow, extra_config_accumulate, extra_post_config_accumulate, **kwargs)
Parameters:
Returns:

Return type:

(rasr.config.RasrConfig, rasr.config.RasrConfig)

create_files()
classmethod create_merge_config(crp, extra_config_merge, extra_post_config_merge, **kwargs)
Parameters:
Returns:

Return type:

(rasr.config.RasrConfig, rasr.config.RasrConfig)

classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

merge()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.cart.estimate.EstimateCartJob(*args, **kwargs)

This job estimates a phonetic decision tree. Given a set of accumulated (squared) feature values, a single Gaussian model is estimated per triphone state. Then states are merged iteratively according to the provided questions such that the loss in log-likelihood of the resulting models is minimized. Finally, states which have a low number of occurrences are merged into the closest cluster.

Parameters:
cleanup_before_run(*args)
classmethod create_config(crp, questions, cart_examples, variance_clipping, generate_cluster_file, extra_config, extra_post_config, **kwargs)
Parameters:
create_files()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.cart.questions

class i6_core.cart.questions.BasicCartQuestions(phoneme_path, max_leaves, min_obs)
get_questions()
load_phonemes_from_file()
write_to_file(file)
class i6_core.cart.questions.BeepCartQuestions(include_central_phoneme=True, *args, **kwargs)
get_questions()
class i6_core.cart.questions.CMUCartQuestions(include_central_phoneme=True, *args, **kwargs)
get_questions()
class i6_core.cart.questions.PythonCartQuestions(phonemes, steps, max_leaves=9001, hmm_states=3)
get_questions()
write_to_file(file)

i6_core.corpus.convert

class i6_core.corpus.convert.CorpusReplaceOrthFromReferenceCorpus(*args, **kwargs)

Copies the orth tag from one corpus to another through matching segment names.

Parameters:
  • bliss_corpus – Corpus in which the orth tag is to be replaced

  • reference_bliss_corpus – Corpus from which the orth tag replacement is taken

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.convert.CorpusReplaceOrthFromTxtJob(*args, **kwargs)

Merge raw text back into a bliss corpus

Parameters:
  • bliss_corpus (Path) – Bliss corpus

  • text_file (Path) – a raw or gzipped text file

  • segment_file (Path|None) – only replace the segments as specified in the segment file

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.convert.CorpusToStmJob(*args, **kwargs)

Convert a Bliss corpus into a .stm file

Parameters:
  • bliss_corpus – Path to Bliss corpus

  • exclude_non_speech – non speech tokens should be removed

  • non_speech_tokens – defines the list of non speech tokens

  • remove_punctuation – should punctuation be removed

  • punctuation_tokens – defines list/string of punctuation tokens

  • fix_whitespace – should white space be fixed. !!!be aware that the corpus loading already fixes white space!!!

  • name – new corpus name

  • tag_mapping – each entry consists of a 3-string tuple (“short name”, “long name”, “description”) for a tag together with a Dict[int, tk.Path], e.g. the out_single_segment_files of a FilterSegments*Job; see the sketch below
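
A hedged sketch of the assumed tag_mapping structure (variable names and the out_stm_path attribute are hypothetical):

from i6_core.corpus.convert import CorpusToStmJob

stm_job = CorpusToStmJob(
    bliss_corpus=dev_corpus,  # hypothetical tk.Path to the dev bliss corpus
    # assumed structure: one entry per tag, mapping ("short", "long", "description")
    # to a Dict[int, tk.Path] such as out_single_segment_files of a FilterSegments*Job
    tag_mapping=[(("f0", "female", "female speakers"), single_segment_files)],
)
stm_file = stm_job.out_stm_path  # assumed output attribute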

classmethod replace_recursive(orthography, token)

Recursion is required to find repeated tokens; string.replace is not sufficient. Some other solution might also work.

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.convert.CorpusToTextDictJob(*args, **kwargs)

Extract the Text from a Bliss corpus to fit a “{key: text}” structure (e.g. for RETURNN)

Parameters:
  • bliss_corpus (Path) – bliss corpus file

  • segment_file (Path|None) – a segment file as optional whitelist

  • invert_match (bool) – use segment file as blacklist (needs to contain full segment names then)

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.convert.CorpusToTxtJob(*args, **kwargs)

Extract orth from a Bliss corpus and store as raw txt or gzipped txt

Parameters:
  • bliss_corpus (Path) – Bliss corpus

  • segment_file (Path) – segment file

  • gzip (bool) – gzip the output text file

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.corpus.costa

class i6_core.corpus.costa.CostaJob(*args, **kwargs)
cleanup_before_run(cmd, retry, task_id, *args)
classmethod create_config(crp, eval_recordings, eval_lm, extra_config, extra_post_config)
create_files()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.corpus.data_augmentation

class i6_core.corpus.data_augmentation.ChangeCorpusSpeedJob(*args, **kwargs)

Changes the speed of all audio files in the corpus (shifting time AND frequency)

Parameters:
  • bliss_corpus (Path) – Bliss corpus

  • corpus_name (str) – name of the new corpus

  • speed_factor (float) – relative speed factor

  • base_frequency (int) – sampling rate of the audio files

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.data_augmentation.SelfNoiseCorpusJob(*args, **kwargs)

Add noise to each recording in the corpus. The noise consists of audio data from other recordings in the corpus and is reduced by the given SNR. Only supports .wav files

WARNING: This Job uses /dev/shm for performance reasons, please be cautious

Parameters:
  • bliss_corpus (Path) – Bliss corpus with wav files

  • snr (float) – signal to noise ratio in dB, positive values only

  • corpus_name (str) – name of the new corpus

  • n_noise_tracks (int) – number of random (parallel) utterances to add

  • seed (int) – seed for random utterance selection

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.corpus.filter

class i6_core.corpus.filter.FilterCorpusBySegmentDurationJob(*args, **kwargs)
Parameters:
  • bliss_corpus – path of the corpus file

  • min_duration – minimum duration for a segment to keep (in seconds)

  • max_duration – maximum duration for a segment to keep (in seconds)

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.filter.FilterCorpusBySegmentsJob(*args, **kwargs)
Parameters:
  • bliss_corpus

  • segment_file – a single segment file or a list of segment files

  • compressed

  • invert_match

  • delete_empty_recordings – if true, empty recordings will be removed

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.filter.FilterCorpusRemoveUnknownWordSegmentsJob(*args, **kwargs)

Filter segments of a bliss corpus if there are unknowns with respect to a given lexicon

Parameters:
  • bliss_corpus

  • bliss_lexicon

  • case_sensitive – consider casing for check against lexicon

  • all_unknown – all words have to be unknown in order for the segment to be discarded

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.filter.FilterSegmentsByAlignmentConfidenceJob(*args, **kwargs)
Parameters:
  • alignment_logs – alignment_job.out_log_file; task_id -> log_file

  • percentile – percent of alignment segments to keep; should be in (0, 100], as used by np.percentile()

  • crp – used to set the number of output segments. If None, the number of alignment log files is used instead.

  • plot – plot the distribution of alignment scores

  • absolute_threshold – alignments with score above this number are discarded

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.filter.FilterSegmentsByListJob(*args, **kwargs)

Filters a segment list file using a given list of segments, which is either used as black or as white list

Parameters:
  • segment_files – original segment list files to be filtered

  • filter_list – list used for filtering or a path to a text file with the entries of that list, one per line

  • invert_match – black list (if False) or white list (if True) usage

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.filter.FilterSegmentsByRegexJob(*args, **kwargs)

Filters a segment list file using a given regular expression

Parameters:
  • segment_files – original segment list files to be filtered

  • filter_regex – regex used for filtering

  • invert_match – keep segment if regex does not match (if False) or does match (if True)

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.corpus.segments

class i6_core.corpus.segments.DynamicSplitSegmentFileJob(*args, **kwargs)

Split the segments into as many shares as given by concurrent. This is a variant of the existing SplitSegmentFileJob that takes a tk.Delayed variable (instead of an int) for the argument concurrent.

Parameters:
  • segment_file (tk.Path|str) – segment file

  • concurrent (tk.Delayed) – number of splits

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.segments.SegmentCorpusByRegexJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.segments.SegmentCorpusBySpeakerJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.segments.SegmentCorpusJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.segments.ShuffleAndSplitSegmentsJob(*args, **kwargs)
default_split = {'dev': 0.1, 'train': 0.9}
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.segments.SortSegmentsByLengthAndShuffleJob(*args, **kwargs)
Parameters:
  • crp – rasr.crp.CommonRasrParameters

  • shuffle_strength – float in [0, inf) that determines how much the length should affect the sorting: 0 -> completely random; inf -> strictly sorted

  • shuffle_seed – random number seed

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.segments.SplitSegmentFileJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.segments.UpdateSegmentsWithSegmentMapJob(*args, **kwargs)

Update a segment file with a segment mapping file (e.g. from corpus compression)

Parameters:
  • segment_file (Path) – path to the segment text file (uncompressed)

  • segment_map (Path) – path to the segment map (gz or uncompressed)

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.corpus.speaker

class i6_core.corpus.speaker.CorpusAddSpeakerTagsFromMappingJob(*args, **kwargs)

Adds speaker tags to a corpus from a given mapping defined by a dictionary

Parameters:
  • corpus – Corpus to add tags to

  • mapping – pickled dictionary that defines a mapping corpus fullname -> speaker id

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.speaker.CorpusRemoveSpeakerTagsJob(*args, **kwargs)

Remove speaker tags from given corpus

Parameters:

corpus – Corpus to remove the tags from

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.corpus.stats

class i6_core.corpus.stats.CountCorpusWordFrequenciesJob(*args, **kwargs)

Extracts a list of words and their counts in the provided bliss corpus

Parameters:

bliss_corpus (Path) – path to corpus file

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.stats.ExtractOovWordsFromCorpusJob(*args, **kwargs)

Extracts the out of vocabulary words based on a given corpus and lexicon

Parameters:
  • bliss_corpus (Union[Path, str]) – path to corpus file

  • bliss_lexicon (Union[Path, str]) – path to lexicon

  • casing (str) – changes the casing of the orthography (options: upper, lower, none). Note that str.upper() is problematic for German since ß -> SS, see https://bugs.python.org/issue34928

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.corpus.transform

class i6_core.corpus.transform.AddCacheToCorpusJob(*args, **kwargs)

Adds a cache manager call to all audio paths in a corpus file

Parameters:

bliss_corpus (Path) – bliss corpus file path

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.transform.ApplyLexiconToCorpusJob(*args, **kwargs)

Use a bliss lexicon to convert all words in a bliss corpus into their phoneme representation.

Currently only supports picking the first phoneme.

Parameters:
  • bliss_corpus (Path) – path to a bliss corpus xml

  • bliss_lexicon (Path) – path to a bliss lexicon file

  • word_separation_orth (str|None) – a default word separation lemma orth. The corresponding phoneme (or phonemes in some special cases) is inserted between each word. Usually it makes sense to use something like “[SILENCE]” or “[space]”.

  • strategy (LexiconStrategy) – strategy to determine which representation is selected
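
A minimal sketch using the default strategy (the input variables and the out_corpus attribute name are assumptions):

from i6_core.corpus.transform import ApplyLexiconToCorpusJob

phoneme_corpus_job = ApplyLexiconToCorpusJob(
    bliss_corpus=bliss_corpus,  # hypothetical input corpus
    bliss_lexicon=bliss_lexicon,  # hypothetical input lexicon
    word_separation_orth="[SILENCE]",  # insert the silence phoneme between words
)
phoneme_corpus = phoneme_corpus_job.out_corpus  # assumed output attribute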

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.transform.CompressCorpusJob(*args, **kwargs)

Compresses a corpus by concatenating audio files and using a compression codec. Does currently not support corpora with subcorpora, files need to be .wav

Parameters:
  • bliss_corpus (Path) – path to an xml corpus file with wave recordings

  • format (str) – supported file formats, currently limited to mp3

  • bitrate (str) – bitrate as string, e.g. ‘32k’ or ‘192k’, can also be an integer e.g. 192000

  • max_num_splits (int) – maximum number of resulting audio files

add_duration_to_recordings(c)

Open each recording, extract the duration and add the duration to the recording object. TODO: this is a lengthy operation, but so far there was no alternative…

Parameters:

c (corpus.Corpus) –

info()

Read the log.run file to extract the current status of the compression job

run()
run_ffmpeg(ffmpeg_inputs, output_path)
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.transform.MergeCorporaJob(*args, **kwargs)

Merges Bliss Corpora files into a single file as subcorpora or flat

Parameters:
  • bliss_corpora (Iterable[Path]) – any iterable of bliss corpora file paths to merge

  • name (str) – name of the new corpus (subcorpora will keep the original names)

  • merge_strategy (MergeStrategy) – how the corpora should be merged, e.g. as subcorpora or flat
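
For example, a sketch that merges two training corpora flat into one corpus (the input variables and the out_merged_corpus attribute name are assumptions):

from i6_core.corpus.transform import MergeCorporaJob, MergeStrategy

merge_job = MergeCorporaJob(
    bliss_corpora=[corpus_a, corpus_b],  # hypothetical input corpora
    name="train-combined",
    merge_strategy=MergeStrategy.FLAT,
)
merged_corpus = merge_job.out_merged_corpus  # assumed output attribute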

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.transform.MergeCorpusSegmentsAndAudioJob(*args, **kwargs)

This job merges segments and audio files based on a rasr cluster map and a list of cluster_names. The cluster map should map segments to something like cluster.XXX where XXX is a natural number (starting with 1). The lines in the cluster_names file will be used as names for the recordings in the new corpus.

The job outputs a new corpus file + the corresponding audio files.

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.transform.MergeStrategy(value)

An enumeration.

CONCATENATE = 2
FLAT = 1
SUBCORPORA = 0
class i6_core.corpus.transform.ReplaceTranscriptionFromCtmJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.corpus.transform.ShiftCorpusSegmentStartJob(*args, **kwargs)

Shifts the start time of a corpus to change the fft window offset

Parameters:
  • bliss_corpus (Path) – path to a bliss corpus file

  • corpus_name (str) – name of the new corpus

  • shift (int) – shift in seconds

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.datasets.huggingface

https://huggingface.co/docs/datasets/

class i6_core.datasets.huggingface.DownloadAndPrepareHuggingFaceDatasetJob(*args, **kwargs)

https://huggingface.co/docs/datasets/ https://huggingface.co/datasets

pip install datasets

Basically wraps datasets.load_dataset(...).save_to_disk(out_dir).

Example for Librispeech:

DownloadAndPrepareHuggingFaceDatasetJob(“librispeech_asr”, “clean”) https://github.com/huggingface/datasets/issues/4179

Parameters:
  • path – Path or name of the dataset, parameter passed to Dataset.load_dataset

  • name – Name of the dataset configuration, parameter passed to Dataset.load_dataset

  • data_files – Path(s) to the source data file(s), parameter passed to Dataset.load_dataset

  • revision – Version of the dataset script, parameter passed to Dataset.load_dataset

  • time_rqmt (float) –

  • mem_rqmt (float) –

  • cpu_rqmt (int) –

  • mini_task (bool) – the job should be run as mini_task
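
A sketch of the Librispeech example above (the requirement values are illustrative; the out_dir attribute follows from the save_to_disk(out_dir) description):

from i6_core.datasets.huggingface import DownloadAndPrepareHuggingFaceDatasetJob

librispeech_job = DownloadAndPrepareHuggingFaceDatasetJob(
    path="librispeech_asr",
    name="clean",
    time_rqmt=24,
    mem_rqmt=8,
    cpu_rqmt=2,
)
prepared_dataset_dir = librispeech_job.out_dir  # can be loaded again via datasets.load_from_disk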

classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.datasets.librispeech

class i6_core.datasets.librispeech.DownloadLibriSpeechCorpusJob(*args, **kwargs)

Downloads a part of the LibriSpeech corpus from https://www.openslr.org/resources/12 and checks for file integrity via md5sum

(see also: https://www.openslr.org/12/)

To get the corpus metadata, use DownloadLibriSpeechMetadataJob

self.out_corpus_folder links to the root of the speaker_id/chapter/* folder structure

Parameters:

corpus_key (str) – corpus identifier, e.g. “train-clean-100”

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.librispeech.DownloadLibriSpeechMetadataJob(*args, **kwargs)

Downloads the metadata file and checks for md5sum integrity

Defines outputs for “SPEAKERS.TXT, CHAPTERS.TXT and BOOKS.TXT”

Parameters:

corpus_key (str) – corpus identifier, e.g. “train-clean-100”

class i6_core.datasets.librispeech.LibriSpeechCreateBlissCorpusJob(*args, **kwargs)

Creates a Bliss corpus from a LibriSpeech corpus folder using the speaker information in addition

Outputs a single bliss .xml.gz file

Parameters:
  • corpus_folder (Path) – Path to a LibriSpeech corpus folder

  • speaker_metadata (Path) – Path to the SPEAKERS.TXT file from the DownloadLibriSpeechMetadataJob (out_speakers)
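
A sketch of the typical download-and-convert chain; out_corpus_folder and out_speakers are mentioned above, while out_corpus and the metadata job defaults are assumptions:

from i6_core.datasets.librispeech import (
    DownloadLibriSpeechCorpusJob,
    DownloadLibriSpeechMetadataJob,
    LibriSpeechCreateBlissCorpusJob,
)

download_job = DownloadLibriSpeechCorpusJob(corpus_key="train-clean-100")
metadata_job = DownloadLibriSpeechMetadataJob()  # assumed to work with its default arguments
bliss_job = LibriSpeechCreateBlissCorpusJob(
    corpus_folder=download_job.out_corpus_folder,
    speaker_metadata=metadata_job.out_speakers,
)
train_corpus = bliss_job.out_corpus  # assumed output attribute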

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.datasets.ljspeech

class i6_core.datasets.ljspeech.DownloadLJSpeechCorpusJob(*args, **kwargs)

Downloads, checks and extracts the LJSpeech corpus.

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.ljspeech.LJSpeechCreateBlissCorpusJob(*args, **kwargs)

Generate a Bliss xml from the downloaded LJspeech dataset

Parameters:
  • metadata (Path) – path to metadata.csv

  • audio_folder (Path) – path to the wavs folder

  • name – overwrite default corpus name

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.datasets.switchboard

Switchboard is conversational telephony speech with 8 kHz audio files. The training data consists of about 300 hours. Reference: https://catalog.ldc.upenn.edu/LDC97S62

number of recordings: 4876
number of segments: 249624
number of speakers: 2260

class i6_core.datasets.switchboard.CreateFisherTranscriptionsJob(*args, **kwargs)

Create the compressed text data based on the fisher transcriptions which can be used for LM training

Part 1: https://catalog.ldc.upenn.edu/LDC2004T19
Part 2: https://catalog.ldc.upenn.edu/LDC2005T19

Parameters:
  • fisher_transcriptions1_folder – path to unpacked LDC2004T19.tgz, usually named fe_03_p1_tran

  • fisher_transcriptions2_folder – path to unpacked LDC2005T19.tgz, usually named fe_03_p2_tran

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateHub5e00CorpusJob(*args, **kwargs)

Creates the switchboard hub5e_00 corpus based on LDC2002S09. No speaker information attached.

Parameters:
  • wav_audio_folder – output of SwitchboardSphereToWave called on extracted LDC2002S09.tgz

  • hub5_transcriptions – extracted LDC2002T43.tgz named “2000_hub5_eng_eval_tr”

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateHub5e01CorpusJob(*args, **kwargs)

Creates the switchboard hub5e_01 corpus based on LDC2002S13

This corpus provides no GLM file, as the same one as for Hub5e_00 should be used

No speaker information attached

Parameters:
  • wav_audio_folder – output of SwitchboardSphereToWave called on extracted LDC2002S13.tgz

  • hub5e01_folder – extracted LDC2002S13 named “hub5e_01”

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateLDCSwitchboardSpeakerListJob(*args, **kwargs)

This creates the speaker list according to the conversation and speaker table from the LDC documentation: https://catalog.ldc.upenn.edu/docs/LDC97S62

The resulting file contains 520 speakers in the format of:

<speaker_id> <gender> <recording>

Parameters:
  • caller_tab_file – caller_tab.csv from the Switchboard LDC documentation

  • conv_tab_file – conv_tab.csv from the Switchboard LDC documentation

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateRT03sCTSCorpusJob(*args, **kwargs)

Create the RT03 test set corpus, specifically the “CTS” subset of LDC2007S10

No speaker information attached

Parameters:
  • wav_audio_folder – output of SwitchboardSphereToWave called on extracted LDC2007S10.tgz

  • rt03_folder – extracted LDC2007S10.tgz

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateSwitchboardBlissCorpusJob(*args, **kwargs)

Creates Switchboard bliss corpus xml

segment name format: sw2001B-ms98-a-<folder-name>

Parameters:
  • audio_dir (tk.Path) – path for audio data

  • trans_dir (tk.Path) – path for transcription data. see DownloadSwitchboardTranscriptionAndDictJob

  • speakers_list_file (tk.Path) –

    path to a speakers list text file with format:

    speaker_id gender recording<channel>, e.g. 1005 F 2452A

    on each line. see CreateSwitchboardSpeakersListJob job

  • skip_empty_ldc_file (bool) – In the original corpus the sequence 2167B is mostly empty, thus exclude it from training (recommended, GMM will fail otherwise)

  • lowercase (bool) – lowercase the transcriptions of the corpus (recommended)

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateSwitchboardLexiconTextFileJob(*args, **kwargs)

This job creates the preprocessed SWB dictionary text file consistent with the training corpus, given a raw dictionary text file downloaded within the transcription directory by the DownloadSwitchboardTranscriptionAndDictJob. The resulting dictionary text file can then be passed to the LexiconFromTextFileJob in order to create a bliss xml lexicon.

Parameters:

raw_dict_file (tk.Path) – path containing the raw dictionary text file

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateSwitchboardSpeakersListJob(*args, **kwargs)
Given some speakers statistics info, this job creates a text file having on each line:

speaker_id gender recording

Parameters:

speakers_stats_file (tk.Path) – speakers stats text file

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateSwitchboardSpokenFormBlissCorpusJob(*args, **kwargs)

Creates a special spoken form version of switchboard-1 used for e.g. BPE or Sentencepiece based models. It includes:

  • make sure everything is lowercased

  • conversion of numbers to written form (using a given conversion table)

  • conversion of some short forms into spoken forms (also using the table)

  • making special tokens uppercase again

Parameters:

switchboard_bliss_corpus – out_corpus of CreateSwitchboardBlissCorpusJob

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.DownloadSwitchboardSpeakersStatsJob(*args, **kwargs)

Note that this does not contain the speaker info for all recordings. We assume later that each recording has a unique speaker, and a unique id is used for those recordings with unknown speaker info

Parameters:
  • url (str) –

  • target_filename (str|None) – explicit output filename, if None tries to detect the filename from the url

  • checksum (str|None) – A sha256 checksum to verify the file

classmethod hash(parsed_args)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

class i6_core.datasets.switchboard.DownloadSwitchboardTranscriptionAndDictJob(*args, **kwargs)

Downloads switchboard training transcriptions and dictionary (or lexicon)

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.SwitchboardSphereToWaveJob(*args, **kwargs)

Takes an audio folder from one of the switchboard LDC folders and converts dual channel .sph files with mulaw encoding to single channel .wav files with s16le encoding

Parameters:

sph_audio_folder

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.datasets.tedlium2

class i6_core.datasets.tedlium2.CreateTEDLIUM2BlissCorpusJob(*args, **kwargs)

Processes stm files from TEDLIUM2 corpus folders and creates Bliss corpus files. Outputs an stm file and a bliss .xml.gz file for each train/dev/test set

Parameters:

corpus_folders (Dict) – {corpus_key: Path}

load_stm_data(stm_file)

Parameters:

stm_file (str) –

make_corpus()

create bliss corpus from stm file (always include speakers)

make_stm()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.tedlium2.DownloadTEDLIUM2CorpusJob(*args, **kwargs)

Download full TED-LIUM Release 2 corpus from https://projets-lium.univ-lemans.fr/wp-content/uploads/corpus/TED-LIUM/ (all train/dev/test/LM/dictionary data included)

process_dict()

minor modification on the dictionary (see comments)

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.datasets.tf_datasets

This module adds jobs for TF datasets, as documented here: https://www.tensorflow.org/datasets

class i6_core.datasets.tf_datasets.DownloadAndPrepareTfDatasetJob(*args, **kwargs)

This job downloads and prepares a TF dataset. The processed files are stored in a data_dir folder, from where it can be loaded again (see https://www.tensorflow.org/datasets/overview#load_a_dataset)

Install the dependencies:

pip install tensorflow-datasets

It further needs some extra dependencies, for example for ‘librispeech’:

pip install apache_beam
pip install pydub  # ffmpeg installed

See here for some more: https://github.com/tensorflow/datasets/blob/master/setup.py

Also maybe:

pip install datasets  # for Huggingface community datasets
Parameters:
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.deprecated.returnn_extract_prior

i6_core.discriminative_training.lattice_generation

i6_core.features.common

i6_core.features.common.add_derivatives(feature_net, derivatives=1)
i6_core.features.common.add_linear_transform(feature_net, matrix_path)
i6_core.features.common.basic_cache_flow(cache_files)
i6_core.features.common.cepstrum_flow(normalize=True, outputs=16, add_epsilon=False, epsilon=1.175494e-38)
i6_core.features.common.external_file_feature_flow(flow_file)
i6_core.features.common.feature_extraction_cache_flow(feature_net, port_name_mapping, one_dimensional_outputs=None)
Parameters:
  • feature_net (rasr.FlowNetwork) – feature flow to extract features from

  • port_name_mapping (dict[str,str]) – maps output ports to names of the cache files

  • one_dimensional_outputs (set[str]|None) – output ports that return one-dimensional features (e.g. energy)

Return type:

rasr.FlowNetwork

i6_core.features.common.fft_flow(preemphasis=1.0, window_type='hamming', window_shift=0.01, window_length=0.025)
i6_core.features.common.make_first_feature_energy(feature_net)
i6_core.features.common.normalize_features(feature_net, length='infinite', right='infinite', norm_type='mean-and-variance')

Add normalization of the specified type to the feature flow

Parameters:
  • feature_net (rasr.FlowNetwork) – the unnormalized flow network, must have an output named ‘features’

  • length (int|str) – length of the normalization window in frames (or ‘infinite’)

  • right (int|str) – number of frames right of the current position in the normalization window (can also be ‘infinite’)

  • norm_type (str) – type of normalization, possible values are ‘level’, ‘mean’, ‘mean-and-variance’, ‘mean-and-variance-1D’, ‘divide-by-mean’, ‘mean-norm’

Returns:

input FlowNetwork with a signal-normalization node before the output

Return type:

rasr.FlowNetwork

i6_core.features.common.raw_audio_flow(audio_format='wav')
i6_core.features.common.samples_flow(audio_format='wav', dc_detection=True, dc_params={'max-dc-increment': 0.9, 'min-dc-length': 0.01, 'min-non-dc-segment-length': 0.021}, input_options=None, scale_input=None)

Create a flow to read samples from audio files, convert it to f32 and apply optional dc-detection.

Files that do not have a native input node will be opened with the ffmpeg flow node. Please check if scaling is needed.

Native input formats are:
  • wav

  • nist

  • flac

  • mpeg (mp3)

  • gsm

  • htk

  • phondat

  • oss

For more information see: https://www-i6.informatik.rwth-aachen.de/rwth-asr/manual/index.php/Audio_Nodes

Parameters:
  • audio_format (str) – the input audio format

  • dc_detection (bool) – enable dc-detection node

  • dc_params (dict) – optional dc-detection node parameters

  • input_options (dict) – additional options for the input node

  • scale_input (int|float|None) – scale the waveform samples, this might be needed to scale ogg inputs by 2**15 to support feature flows designed for 16-bit wav inputs

Returns:

i6_core.features.common.select_features(feature_net, select_range)
i6_core.features.common.sync_energy_features(feature_net, energy_net)
i6_core.features.common.sync_features(feature_net, target_net, feature_output='features', target_output='features')

i6_core.features.energy

i6_core.features.energy.EnergyJob(crp: CommonRasrParameters, energy_options: Optional[Dict[str, Any]] = None, **kwargs) FeatureExtractionJob
i6_core.features.energy.energy_flow(without_samples: bool = False, samples_options: Optional[Dict[str, Any]] = None, fft_options: Optional[Dict[str, Any]] = None, normalization_type: str = 'divide-by-mean') FlowNetwork
Parameters:
  • without_samples

  • samples_options – arguments to sample_flow()

  • fft_options – arguments to fft_flow()

  • normalization_type (str) –

i6_core.features.extraction

class i6_core.features.extraction.FeatureExtractionJob(*args, **kwargs)

Runs feature extraction of a given corpus into cache files

The cache files can be accessed as bundle Path (out_feature_bundle) or as MultiOutputPath (out_feature_path)

Parameters:
  • crp (rasr.crp.CommonRasrParameters) – common RASR parameters

  • feature_flow (rasr.flow.FlowNetwork) – feature flow for feature foraging

  • port_name_mapping (dict[str,str]) – mapping between output ports (key) and name of the features (value)

  • one_dimensional_outputs (set[str]|None) – set of output ports with one dimensional features

  • job_name (str) – name used in sisyphus visualization and job folder name

  • rtf (float) – real-time-factor of the feature-extraction

  • mem (int) – memory required for the job

  • parallel (int) – maximum number of parallely running tasks

  • indirect_write (bool) – if true will write to temporary directory first before copying to output folder

  • extra_config (rasr.config.RasrConfig|None) – additional RASR config merged into the final config

  • extra_post_config (rasr.config.RasrConfig|None) – additional RASR config that will not be part of the hash

cleanup_before_run(cmd, retry, task_id, *args)
classmethod create_config(crp, feature_flow, extra_config, extra_post_config, **kwargs)
Parameters:
Returns:

config, post_config

Return type:

(rasr.config.RasrConfig, rasr.config.RasrConfig)

create_files()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run(task_id)
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.features.filterbank

i6_core.features.filterbank.FilterbankJob(crp, filterbank_options=None, **kwargs)
Parameters:
Returns:

Feature extraction job with filterbank flow

Return type:

FeatureExtractionJob

i6_core.features.filterbank.filter_width_from_channels(channels, warping_function='mel', f_max=8000, f_min=0)
Per default we use FilterBank::stretchToCover, which computes the number of filters as:

number_of_filters = (maximumFrequency_ - minimumFrequency_ - filterWidth_) / spacing_ + 1

Parameters:
  • channels (int) – Number of channels of the filterbank

  • warping_function (str) – Warping function used by the filterbank. [‘mel’, ‘bark’]

  • f_max (float) – Filters are placed only below this frequency in Hz. The physical maximum is half of the audio sample rate, but lower values make possibly more sense.

  • f_min (float) – Filters are placed only over this frequency in Hz

Returns:

filter-width

Return type:

float

i6_core.features.filterbank.filterbank_flow(warping_function='mel', filter_width=70, normalize=True, normalization_options=None, without_samples=False, samples_options=None, fft_options=None, apply_log=False, add_epsilon=False, add_features_output=False)
Parameters:
  • warping_function (str) – “mel” or “bark”

  • filter_width (int) – filter width in Hz. Please use filter_width_from_channels() to get N filters.

  • normalize (bool) – add a final signal-normalization node

  • normalization_options (dict[str, Any]|None) – option dict for signal-normalization flow node

  • without_samples (bool) – creates the flow network without a sample flow, but expects “samples” as input

  • samples_options (dict[str, Any]|None) – parameter dict for samples_flow()

  • fft_options (dict[str, Any]|None) – parameter dict for fft_flow()

  • apply_log (bool) – adds a logarithm before normalization

  • add_epsilon (bool) – if a logarithm should be applied, add a small epsilon to prohibit zeros

  • add_features_output (bool) – Add the output port “features”. This should be set to True, default is False to not break existing hash.

Returns:

filterbank flow network

Return type:

rasr.FlowNetwork
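
For example, an 80-dimensional log-mel filterbank flow could be built like this, using filter_width_from_channels() as recommended above:

from i6_core.features.filterbank import filterbank_flow, filter_width_from_channels

log_mel_flow = filterbank_flow(
    warping_function="mel",
    filter_width=filter_width_from_channels(channels=80, warping_function="mel", f_max=8000),
    apply_log=True,
    add_epsilon=True,
    add_features_output=True,  # expose the "features" output port
)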

i6_core.features.gammatone

i6_core.features.gammatone.GammatoneJob(crp: CommonRasrParameters, gt_options: Optional[Dict[str, Any]] = None, **kwargs) FeatureExtractionJob
i6_core.features.gammatone.gammatone_flow(minfreq: int = 100, maxfreq: int = 7500, channels: int = 68, warp_freqbreak: Optional[int] = None, tempint_type: str = 'hanning', tempint_shift: float = 0.01, tempint_length: float = 0.025, flush_before_gap: bool = True, do_specint: bool = True, specint_type: str = 'hanning', specint_shift: int = 4, specint_length: int = 9, normalize: bool = True, preemphasis: bool = True, legacy_scaling: bool = False, without_samples: bool = False, samples_options: Optional[Dict[str, Any]] = None, normalization_options: Optional[Dict[str, Any]] = None, add_features_output: bool = False) FlowNetwork
Parameters:
  • minfreq

  • maxfreq

  • channels

  • warp_freqbreak

  • tempint_type

  • tempint_shift

  • tempint_length

  • flush_before_gap

  • do_specint

  • specint_type

  • specint_shift

  • specint_length

  • normalize

  • preemphasis

  • legacy_scaling

  • without_samples

  • samples_options – arguments to sample_flow()

  • normalization_options

  • add_features_output

i6_core.features.mfcc

i6_core.features.mfcc.MfccJob(crp: CommonRasrParameters, mfcc_options: Optional[Dict[str, Any]] = None, **kwargs) FeatureExtractionJob
Parameters:
  • crp

  • mfcc_options – Nested parameters for mfcc_flow()

i6_core.features.mfcc.mfcc_flow(warping_function: str = 'mel', filter_width: float = 268.258, normalize: bool = True, normalization_options: Optional[Dict[str, Any]] = None, without_samples: bool = False, samples_options: Optional[Dict[str, Any]] = None, fft_options: Optional[Dict[str, Any]] = None, cepstrum_options: Optional[Dict[str, Any]] = None, add_features_output: bool = False) FlowNetwork
Parameters:
  • warping_function

  • filter_width

  • normalize – whether to add or not a normalization layer

  • normalization_options

  • without_samples

  • samples_options – arguments to sample_flow()

  • fft_options – arguments to fft_flow()

  • cepstrum_options – arguments to cepstrum_flow()

  • add_features_output – Add the output port “features” when normalize is True. This should be set to True, default is False to not break existing hash.
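
A sketch of how nested options could be passed through to mfcc_flow() (the crp variable is hypothetical and the out_feature_bundle key is an assumption):

from i6_core.features.mfcc import MfccJob

mfcc_job = MfccJob(
    crp=crp,  # hypothetical CommonRasrParameters instance
    mfcc_options={
        "normalize": True,
        "fft_options": {"window_shift": 0.01, "window_length": 0.025},
        "cepstrum_options": {"outputs": 16},
    },
)
mfcc_features = mfcc_job.out_feature_bundle["mfcc"]  # assumed to be a dict keyed by feature name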

i6_core.features.mrasta

i6_core.features.mrasta.MrastaJob(crp, mrasta_options=None, **kwargs)
i6_core.features.mrasta.mrasta_flow(temporal_size=101, temporal_right=50, derivatives=1, gauss_filters=6, warping_function='mel', filter_width=268.258, filterbank_outputs=20, samples_options={}, fft_options={})

i6_core.features.normalization

class i6_core.features.normalization.CovarianceNormalizationJob(*args, **kwargs)
cleanup_before_run(cmd, retry, task_id, *args)
classmethod create_config(crp, feature_flow, extra_config_estimate=None, extra_post_config_estimate=None, extra_config_normalization=None, extra_post_config_normalization=None)
create_files()
estimate()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

normalization()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.features.plp

i6_core.features.plp.PlpJob(crp, sampling_rate, plp_options=None, **kwargs)
i6_core.features.plp.plp_flow(warping_function='bark', num_features=20, sampling_rate=8000, filter_width=3.8, normalize=True, normalization_options=None, without_samples=False, samples_options=None, fft_options=None, add_features_output=False)

i6_core.features.sil_norm

class i6_core.features.sil_norm.ExtractSegmentSilenceNormalizationMapJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.features.sil_norm.ExtractSilenceNormalizationMapJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.features.sil_norm.UnwarpTimesInCTMJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.features.sil_norm.samples_with_silence_normalization_flow(audio_format='wav', dc_detection=True, dc_params=None, silence_params=None)

i6_core.features.tone

class i6_core.features.tone.ToneJob(*args, **kwargs)
cleanup_before_run(cmd, retry, task_id, *args)
convert(task_id)
classmethod create_convert_config(crp, timestamp_flow, timestamp_port, extra_convert_config, extra_convert_post_config, **kwargs)
classmethod create_convert_flow(crp, timestamp_flow, timestamp_port, **kwargs)
classmethod create_dump_config(crp, samples_flow, extra_dump_config, extra_dump_post_config, **kwargs)
classmethod create_dump_flow(crp, samples_flow, **kwargs)
create_files()
dump(task_id)
extract_pitch()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.features.voiced

i6_core.features.voiced.VoicedJob(crp, voiced_options=None, **kwargs)
i6_core.features.voiced.voiced_flow(window_shift=0.01, window_duration=0.04, min_pos=0.0025, max_pos=0.0167, without_samples=False, samples_options={}, add_voiced_output=False)

i6_core.g2p.apply

class i6_core.g2p.apply.ApplyG2PModelJob(*args, **kwargs)

Apply a trained G2P on a word list file

Parameters:
  • g2p_model (Path) –

  • word_list_file (Path) – text file with a word each line

  • variants_mass (float) –

  • variants_number (int) –

  • g2p_path (Optional[Path]) –

  • g2p_python (Optional[Path]) –

  • filter_empty_words (bool) – if True, creates a new lexicon file with no empty translated words

  • concurrent (int) – split up word list file to parallelize job into this many instances

filter()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

merge()
run(task_id)
split_word_list()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.g2p.convert

class i6_core.g2p.convert.BlissLexiconToG2PLexiconJob(*args, **kwargs)

Convert a bliss lexicon into a g2p compatible lexicon for training

Parameters:
  • bliss_lexicon (Path) –

  • include_pronunciation_variants (bool) – In case of multiple phoneme representations for one lemma, when this is false it outputs only the first phoneme

  • include_orthography_variants (bool) – In case of multiple orthographic representations for one lemma, when this is false it outputs only the first orth

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.g2p.convert.G2POutputToBlissLexiconJob(*args, **kwargs)

Convert a g2p applied word list file (g2p lexicon) into a bliss lexicon

Parameters:
  • iv_bliss_lexicon (Path) – bliss lexicon as reference for the phoneme inventory

  • g2p_lexicon (Path) – from ApplyG2PModelJob.out_g2p_lexicon

  • merge (bool) – merge the g2p lexicon into the iv_bliss_lexicon instead of only taking the phoneme inventory

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.g2p.train

class i6_core.g2p.train.TrainG2PModelJob(*args, **kwargs)

Train a G2P model using Sequitur

see https://github.com/sequitur-g2p/sequitur-g2p

Parameters:
  • g2p_lexicon (Path) – g2p_lexicon for training, use BlissLexiconToG2PLexiconJob to generate a g2p_lexicon from a bliss lexicon

  • num_ramp_ups (int) – number of global ramp-ups (n-gram-iness)

  • min_iter (int) – minimum iterations per ramp-up

  • max_iter (int) – maximum iterations per ramp-up

  • devel (str) – passed as -d argument, percent of train lexicon held out as validation set

  • size_constrains (str) – passed as -s argument, multigrams must have l1 … l2 left-symbols and r1 … r2 right-symbols

  • extra_args (list[str]) – extra cmd arguments that are passed to the g2p process

  • g2p_path (Optional[Path]) – path to the g2p installation. If None, searches for a global G2P_PATH, and uses the default binary path if not existing.

  • g2p_python (Optional[Path]) – path to the g2p python binary. If None, searches for a global G2P_PYTHON, and uses the default python binary if not existing.
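
A sketch of the full G2P chain for OOV words; all out_* attribute names except ApplyG2PModelJob.out_g2p_lexicon (mentioned above) are assumptions:

from i6_core.g2p.convert import BlissLexiconToG2PLexiconJob, G2POutputToBlissLexiconJob
from i6_core.g2p.train import TrainG2PModelJob
from i6_core.g2p.apply import ApplyG2PModelJob

g2p_train_lexicon = BlissLexiconToG2PLexiconJob(bliss_lexicon).out_g2p_lexicon  # assumed attribute
g2p_model = TrainG2PModelJob(g2p_lexicon=g2p_train_lexicon).out_best_model  # assumed attribute
apply_job = ApplyG2PModelJob(g2p_model=g2p_model, word_list_file=oov_word_list)
oov_lexicon = G2POutputToBlissLexiconJob(
    iv_bliss_lexicon=bliss_lexicon,
    g2p_lexicon=apply_job.out_g2p_lexicon,
    merge=True,
).out_oov_lexicon  # assumed attribute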

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.lda.config

i6_core.lda.estimate

class i6_core.lda.estimate.EstimateLDAMatrixJob(*args, **kwargs)
cleanup_before_run(*args)
classmethod create_config(crp, between_class_scatter_matrix, within_class_scatter_matrix, reduced_dimension, eigenvalue_problem_config, generalized_eigenvalue_problem_config, extra_config, extra_post_config)
create_files()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lda.estimate.EstimateScatterMatricesJob(*args, **kwargs)
accumulate(task_id)
cleanup_before_run(cmd, retry, *args)
classmethod create_accumulate_config(crp, alignment_flow, extra_config_accumulate, extra_post_config_accumulate, **kwargs)
classmethod create_estimate_config(crp, extra_config_estimate, extra_post_config_estimate, **kwargs)
create_files()
classmethod create_merge_config(crp, extra_config_merge, extra_post_config_merge, **kwargs)
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

merge()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.lda.flow

i6_core.lda.flow.add_context_flow(feature_net, max_size=9, right=4, margin_condition='present-not-empty', expand_timestamp=False)

i6_core.lexicon.allophones

class i6_core.lexicon.allophones.DumpStateTyingJob(*args, **kwargs)
cleanup_before_run(cmd, retry, *args)
classmethod create_config(crp, extra_config, extra_post_config, **kwargs)
create_files()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lexicon.allophones.StoreAllophonesJob(*args, **kwargs)
cleanup_before_run(cmd, retry, *args)
classmethod create_config(crp, extra_config, extra_post_config, **kwargs)
create_files()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.lexicon.beep

class i6_core.lexicon.beep.BeepToBlissLexiconJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lexicon.beep.DownloadBeepJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.lexicon.cmu

class i6_core.lexicon.cmu.CMUDictToBlissJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lexicon.cmu.DownloadCMUDictJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.lexicon.conversion

class i6_core.lexicon.conversion.FilterLexiconByWordListJob(*args, **kwargs)

Filter lemmata to given word list. Warning: case_sensitive parameter does the opposite. Kept for backwards-compatibility.

Parameters:
  • bliss_lexicon (tk.Path) – lexicon file to be handled

  • word_list (tk.Path) – filter lexicon by this word list

  • case_sensitive (bool) – filter lemmata case-sensitive. Warning: parameter does the opposite.

  • check_synt_tok (bool) – keep also lemmata where the syntactic token matches word_list

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lexicon.conversion.GraphemicLexiconFromWordListJob(*args, **kwargs)
default_transforms = {'+': 'PLUS', '.': 'DOT', '{': 'LBR', '}': 'RBR'}
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lexicon.conversion.LexiconFromTextFileJob(*args, **kwargs)

Create a bliss lexicon from a regular text file, where each line contains: <WORD> <PHONEME1> <PHONEME2> … separated by tabs or spaces. The lemmata will be added in the order they appear in the text file, the phonemes will be sorted alphabetically. Phoneme variants of the same word need to appear next to each other.

WARNING: No special lemmas or phonemes are added, so do not use this lexicon with RASR directly!

As the splitting is taken from RASR and not fully tested, it might not work in all cases so do not use this job without checking the output manually on new lexica.

Parameters:
  • text_file (Path) –

  • compressed – save as .xml.gz

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lexicon.conversion.LexiconToWordListJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lexicon.conversion.LexiconUniqueOrthJob(*args, **kwargs)

Merge lemmata with the same orthography.

Parameters:
  • bliss_lexicon (tk.Path) – lexicon file to be handled

  • merge_multi_orths_lemmata (bool) –

    if True, also lemmata containing multiple orths are merged based on their primary orth. Otherwise they are ignored.

    Merging strategy:

    • orth/phon/eval: all orth/phon/eval elements are merged together

    • synt: the synt element is only copied to the target lemma when
      1. the target lemma does not already have one

      2. and one of the remaining to-be-merged lemmata has a synt element.

      (having a synt <=> synt is not None)

      This could lead to INFORMATION LOSS if there are several different synt token sequences in the to-be-merged lemmata

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lexicon.conversion.SpellingConversionJob(*args, **kwargs)

Spelling conversion for lexicon.

Parameters:
  • bliss_lexicon (Path) – input lexicon whose lemmata all have a unique PRIMARY orth; to meet this requirement, apply LexiconUniqueOrthJob first

  • orth_mapping_file (str) –

    orthography mapping file: *.json, *.json.gz, *.txt or *.gz. In case of a plain text file the delimiter can be adjusted via mapping_file_delimiter, and a line starting with “#” is a comment line

  • mapping_file_delimiter (str) – delimiter between source and target orths in the mapping file, relevant only if the mapping is provided as a plain text file

  • mapping_rules (Optional[List[Tuple[str, str, str]]]) –

    a list of mapping rules, where each rule is represented by 3 strings (source orth-substring, target orth-substring, pos) and pos should be one of [“leading”, “trailing”, “any”].

    e.g. the rule (“zation”, “sation”, “trailing”) converts orths ending in -zation to orths ending in -sation. Set this ONLY for clearly defined rules which cannot generate any kind of ambiguity.

  • invert_mapping (bool) – invert the input orth mapping. NOTE: this also affects the pairs which are inferred from mapping_rules

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.lexicon.modification

class i6_core.lexicon.modification.AddEowPhonemesToLexiconJob(*args, **kwargs)

Extends phoneme set of a lexicon by additional end-of-word (eow) versions of all regular phonemes. Modifies lemmata to use the new eow-version of the final phoneme in each pronunciation.

Parameters:
  • bliss_lexicon – Base lexicon to be modified.

  • nonword_phones – List of nonword-phones for which no eow-versions will be added, e.g. [noise]. Phonemes that occur in special lemmata are found automatically and do not need to be specified here.

  • boundary_marker – String that is appended to phoneme symbols to mark eow-version.

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lexicon.modification.MergeLexiconJob(*args, **kwargs)

Merge multiple bliss lexica into a single bliss lexicon.

Phonemes and lemmata can be individually sorted alphabetically or kept as is.

When merging a lexicon with a static lexicon, putting the static lexicon first and only sorting the phonemes will result in the “typical” lexicon structure.

Please be aware that the sorting or merging of lexica that were already used will create a new lexicon that might be incompatible with previously generated alignments.

Parameters:
  • bliss_lexica (list[Path]) – list of bliss lexicon files (plain or gz)

  • sort_phonemes (bool) – sort phoneme inventory alphabetically

  • sort_lemmata (bool) – sort lemmata alphabetically based on first orth entry

  • compressed (bool) – compress final lexicon

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]
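
A sketch of merging a special-lemma (static) lexicon with a regular word lexicon as described above; paths are placeholders and the out_bliss_lexicon attribute name is an assumption:

from sisyphus import tk
from i6_core.lexicon.modification import MergeLexiconJob

static_lexicon = tk.Path("/path/to/special_lemma_lexicon.xml")  # placeholder
word_lexicon = tk.Path("/path/to/word_lexicon.xml.gz")          # placeholder

merge_job = MergeLexiconJob(
    bliss_lexica=[static_lexicon, word_lexicon],  # static lexicon first, see note above
    sort_phonemes=True,
    sort_lemmata=False,
    compressed=True,
)
# merged lexicon output, attribute name assumed: merge_job.out_bliss_lexicon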

class i6_core.lexicon.modification.WriteLexiconJob(*args, **kwargs)

Create a bliss lexicon file from a static Lexicon.

Supports optional sorting of phonemes and lemmata.

Example for a static lexicon:
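
(A sketch only: the phoneme and lemma values below are illustrative and the out_bliss_lexicon output attribute name is assumed; Lexicon and Lemma are documented under i6_core.lib.lexicon.)

from i6_core.lib import lexicon
from i6_core.lexicon.modification import WriteLexiconJob

static_lexicon = lexicon.Lexicon()
# context-independent phoneme for silence
static_lexicon.add_phoneme("[SILENCE]", variation="none")
static_lexicon.add_lemma(
    lexicon.Lemma(orth=["[SILENCE]"], phon=["[SILENCE]"], synt=[], eval=[[]], special="silence")
)

write_job = WriteLexiconJob(
    static_lexicon=static_lexicon,
    sort_phonemes=True,
    sort_lemmata=False,
    compressed=True,
)
# resulting lexicon file, attribute name assumed: write_job.out_bliss_lexicon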

Parameters:
  • static_lexicon (lexicon.Lexicon) – A Lexicon object

  • sort_phonemes (bool) – sort phoneme inventory alphabetically

  • sort_lemmata (bool) – sort lemmata alphabetically based on first orth entry

  • compressed (bool) – compress final lexicon

classmethod hash(parsed_args)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.lib.corpus

Helper functions and classes for Bliss xml corpus loading and writing

class i6_core.lib.corpus.Corpus

This class represents a corpus in the Bliss format. It is also used to represent subcorpora when the parent_corpus attribute is set. Corpora with include statements can be read but are written back as a single file.

add_recording(recording: Recording)
add_speaker(speaker: Speaker)
add_subcorpus(corpus: Corpus)
all_recordings() Iterable[Recording]
all_speakers() Iterable[Speaker]
dump(path: str)
Parameters:

path – target .xml or .xml.gz path

filter_segments(filter_function: Callable[[Corpus, Recording, Segment], bool])

Filter all segments (including those in subcorpora) using filter_function.

Parameters:

filter_function – takes the arguments corpus, recording and segment, and returns True if the segment should be kept

fullname() str
load(path: str)
Parameters:

path – corpus .xml or .xml.gz

remove_recording(recording: Recording)
segments() Iterable[Segment]
Returns:

an iterator over all segments within the corpus

speaker(speaker_name: Optional[str], default_speaker: Optional[Speaker]) Speaker
top_level_recordings() Iterable[Recording]
top_level_speakers() Iterable[Speaker]
top_level_subcorpora() Iterable[Corpus]
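
A short sketch of typical Corpus handling at Job runtime; the paths are placeholders and the segment start/end attributes used in the filter are assumptions:

from i6_core.lib import corpus

c = corpus.Corpus()
c.load("/path/to/corpus.xml.gz")  # placeholder path

# keep only segments shorter than 30 seconds (start/end attribute names are assumed)
c.filter_segments(lambda corpus_, recording, segment: (segment.end - segment.start) < 30.0)

for segment in c.segments():
    print(segment.fullname())

c.dump("/path/to/filtered_corpus.xml.gz")
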
class i6_core.lib.corpus.CorpusSection
class i6_core.lib.corpus.NamedEntity
class i6_core.lib.corpus.Recording
add_segment(segment: Segment)
dump(out: TextIO, indentation: str = '')
fullname() str
speaker(speaker_name: Optional[str] = None) Speaker
class i6_core.lib.corpus.Segment
dump(out: TextIO, indentation: str = '')
fullname() str
speaker() Speaker
class i6_core.lib.corpus.Speaker
dump(out: TextIO, indentation: str = '')

i6_core.lib.hdf

i6_core.lib.hdf.get_input_dict_from_returnn_hdf(hdf_file: File) Dict[str, ndarray]

Generate dictionary containing the “data” value as ndarray indexed by the sequence tag

Parameters:

hdf_file – HDF file to extract data from

Returns:

i6_core.lib.hdf.get_returnn_simple_hdf_writer(returnn_root: Optional[str])

Get the RETURNN SimpleHDFWriter. This will add RETURNN to the python path, so only use it at Job runtime.

Parameters:

returnn_root – file path to the RETURNN repository root folder

i6_core.lib.lexicon

Library for the RASR Lexicon files

For format details visit: https://www-i6.informatik.rwth-aachen.de/rwth-asr/manual/index.php/Lexicon

class i6_core.lib.lexicon.Lemma(orth: Optional[List[str]] = None, phon: Optional[List[str]] = None, synt: Optional[List[str]] = None, eval: Optional[List[List[str]]] = None, special: Optional[str] = None)

Represents a lemma of a lexicon

Parameters:
  • orth – list of spellings used in the training data

  • phon – list of pronunciation variants. Each str should contain a space separated string of phonemes from the phoneme-inventory.

  • synt – list of LM tokens that form a single token sequence. This sequence is used as the language model representation.

  • eval – list of output representations. Each sublist should contain one possible transcription (token sequence) of this lemma that is scored against the reference transcription.

  • special – assigns special property to a lemma. Supported values: “silence”, “unknown”, “sentence-boundary”, or “sentence-begin” / “sentence-end”

classmethod from_element(e)
Parameters:

e (ET.Element) –

Return type:

Lemma

to_xml()
Returns:

xml representation

Return type:

ET.Element

class i6_core.lib.lexicon.Lexicon

Represents a bliss lexicon, can be read from and written to .xml files

add_lemma(lemma)
Parameters:

lemma (Lemma) –

add_phoneme(symbol, variation='context')
Parameters:
  • symbol (str) – representation of one phoneme

  • variation (str) – possible values: “context” or “none”. Use none for context independent phonemes like silence and noise.

load(path)
Parameters:

path (str) – bliss lexicon .xml or .xml.gz file

remove_phoneme(symbol)
Parameters:

symbol (str) –

to_xml()
Returns:

xml representation, can be used with util.write_xml

Return type:

ET.Element

i6_core.lib.lm

class i6_core.lib.lm.Lm(lm_path)

Interface to access the ngrams of an LM. Currently supports only LMs in arpa format.

Parameters:

lm_path (str) – Path to the LM file, currently supports only arpa files

get_ngrams(n)

returns all the ngrams of order n

load_arpa()
i6_core.lib.lm.not_ngrams(text: str)
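
A small sketch for reading an ARPA LM; the path is a placeholder, and whether get_ngrams requires a prior load_arpa call is an assumption:

from i6_core.lib.lm import Lm

lm = Lm("/path/to/lm.arpa")  # placeholder, arpa format only
lm.load_arpa()               # parse the arpa file (assumed to be needed before get_ngrams)
bigrams = lm.get_ngrams(2)   # all ngrams of order 2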

i6_core.lib.rasr_cache

This module is about reading (maybe later also writing) the Rasr archive format.

class i6_core.lib.rasr_cache.AllophoneLabeling(silence_phone, allophone_file, phoneme_file=None, state_tying_file=None, verbose_out=None)

Allophone labeling.

Parameters:
  • silence_phone (str) – e.g. “si”

  • allophone_file (str) – list of allophones

  • phoneme_file (str|None) – list of phonemes

  • state_tying_file (str|None) – allophone state tying (e.g. via CART). maps each allophone state to a class label

  • verbose_out (file) – stream to dump log messages

get_label_idx(allo_idx, state_idx)
Parameters:
  • allo_idx (int) –

  • state_idx (int) –

Return type:

int

get_label_idx_by_allo_state_idx(allo_state_idx)
Parameters:

allo_state_idx (int) –

Return type:

int

class i6_core.lib.rasr_cache.FileArchive(filename, must_exists=False, encoding='ascii')

File archive.

RasrCacheHeader = 'SP_ARC1\x00'
addAttributes(filename, dim, duration)
Parameters:
  • filename (str) –

  • dim (int) –

  • duration (float) –

addFeatureCache(filename, features, times)
Parameters:
  • filename (str) –

  • features

  • times

end_recovery_tag = 1437226410
file_list()
Return type:

list[str]

finalize()

Finalize.

getState(mix)
Parameters:

mix (int) –

Returns:

(mix, state)

Return type:

(int,int)

has_entry(filename)
Parameters:

filename (str) – argument for self.read()

Returns:

True if we have this entry

read(filename, typ)
Parameters:
  • filename (str) – the entry-name in the archive

  • typ (str) – “str”, “feat” or “align”

Returns:

depending on typ, “str” -> string, “feat” -> (time, data), “align” -> align, where string is a str, time is list of time-stamp tuples (start-time,end-time) in millisecs,

data is a list of features, each a numpy vector,

align is a list of (time, allophone, state), time is an int from 0 to len of align,

allophone is some int, state is e.g. in [0,1,2].

Return type:

str|(list[numpy.ndarray],list[numpy.ndarray])|list[(int,int,int)]

readFileInfoTable()

Read file info table.

read_S16()
Return type:

float

read_U32()
Return type:

int

read_U8()
Return type:

int

read_bytes(l)
Return type:

bytes

read_char()
Return type:

int

read_f32()
Return type:

float

read_f64()
Return type:

float

read_packed_U32()
Return type:

int

read_str(l, enc='ascii')
Return type:

str

read_u32()
Return type:

int

read_u64()
Return type:

int

read_v(typ, size)
Parameters:
  • typ (str) – “f” for float (float32) or “d” for double (float64)

  • size (int) – number of elements to return

Returns:

numpy array of shape (size,) of dtype depending on typ

Return type:

numpy.ndarray

scanArchive()

Scan archive.

setAllophones(f)
Parameters:

f (str) – allophone filename. line-separated. will ignore lines starting with “#”

start_recovery_tag = 2857740885
writeFileInfoTable()

Write file info table.

write_U32(i)
Parameters:

i (int) –

Return type:

int

write_char(i)
Parameters:

i (int) –

Return type:

int

write_f32(i)
Parameters:

i (float) –

Return type:

int

write_f64(i)
Parameters:

i (float) –

Return type:

int

write_str(s, enc='ascii')
Parameters:

s (str) –

Return type:

int

write_u32(i)
Parameters:

i (int) –

Return type:

int

write_u64(i)
Parameters:

i (int) –

Return type:

int

class i6_core.lib.rasr_cache.FileArchiveBundle(filename, encoding='ascii')

File archive bundle.

Parameters:
  • filename (str) – .bundle file

  • encoding (str) – encoding used in the files

file_list()
Return type:

list[str]

Returns:

list of content-filenames (which can be used for self.read())

has_entry(filename)
Parameters:

filename (str) – argument for self.read()

Returns:

True if we have this entry

read(filename, typ)
Parameters:
  • filename (str) – the entry-name in the archive

  • typ (str) – “str”, “feat” or “align”

Returns:

depending on typ, “str” -> string, “feat” -> (time, data), “align” -> align, where string is a str, time is list of time-stamp tuples (start-time,end-time) in millisecs,

data is a list of features, each a numpy vector,

align is a list of (time, allophone, state), time is an int from 0 to len of align,

allophone is some int, state is e.g. in [0,1,2].

Return type:

str|(list[numpy.ndarray],list[numpy.ndarray])|list[(int,int,int)]

Uses FileArchive.read().

setAllophones(filename)
Parameters:

filename (str) – allophone filename

class i6_core.lib.rasr_cache.FileInfo(name, pos, size, compressed, index)

File info.

Parameters:
  • name (str) –

  • pos (int) –

  • size (int) –

  • compressed (bool|int) –

  • index (int) –

class i6_core.lib.rasr_cache.MixtureSet(filename)

Mixture set.

Parameters:

filename (str) –

getCovByIdx(idx)
Parameters:

idx (int) –

Return type:

numpy.ndarray

getMeanByIdx(idx)
Parameters:

idx (int) –

Return type:

numpy.ndarray

getNumberMixtures()
Return type:

int

read_U32()
Return type:

int

read_char()
Return type:

int

read_f32()
Return type:

float

read_f64()
Return type:

float

read_str(l, enc='ascii')
Parameters:
  • l (int) –

  • enc (str) –

Return type:

str

read_u32()
Return type:

int

read_u64()
Return type:

int

read_v(size, a)
Parameters:
  • size (int) –

  • a (array.array) –

Return type:

array.array

write(filename)
Parameters:

filename (str) –

write_U32(i)
Parameters:

i (int) –

Return type:

int

write_char(i)
Parameters:

i (int) –

Return type:

int

write_f32(i)
Parameters:

i (float) –

Return type:

int

write_f64(i)
Parameters:

i (float) –

Return type:

int

write_str(s, enc='ascii')
Parameters:
  • s (str) –

  • enc (str) –

Return type:

int

write_u32(i)
Parameters:

i (int) –

Return type:

int

write_u64(i)
Parameters:

i (int) –

Return type:

int

class i6_core.lib.rasr_cache.WordBoundaries(filename)

Word boundaries.

Parameters:

filename (str) –

read_str(l, enc='ascii')
Return type:

str

read_u16()
Return type:

int

read_u32()
Return type:

int

i6_core.lib.rasr_cache.is_rasr_cache_file(filename)
Parameters:

filename (str) – file to check. must exist

Returns:

True iff this is a rasr cache (which can be loaded with open_file_archive())

Return type:

bool

i6_core.lib.rasr_cache.open_file_archive(archive_filename, must_exists=True, encoding='ascii')
Parameters:
  • archive_filename (str) –

  • must_exists (bool) –

  • encoding (str) –

Return type:

FileArchiveBundle|FileArchive
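
Reading an alignment cache could look roughly like this; the bundle and allophone paths are placeholders, and skipping ".attribs" entries is an assumption about typical cache contents:

from i6_core.lib.rasr_cache import open_file_archive

archive = open_file_archive("alignment.cache.bundle")  # returns FileArchive or FileArchiveBundle
archive.setAllophones("/path/to/allophones")           # needed to interpret "align" entries

for entry_name in archive.file_list():
    if entry_name.endswith(".attribs"):
        continue  # assumption: attribute entries carry no alignment data
    alignment = archive.read(entry_name, "align")      # list of (time, allophone, state)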

i6_core.lm.lm_image

class i6_core.lm.lm_image.CreateLmImageJob(*args, **kwargs)

pre-compute LM image without generating global cache

cleanup_before_run(cmd, retry, *args)
classmethod create_config(crp, extra_config, extra_post_config, encoding, **kwargs)
create_files()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.lm.perplexity

class i6_core.lm.perplexity.ComputePerplexityJob(*args, **kwargs)
cleanup_before_run(cmd, retry, *args)
classmethod create_config(crp, text_file, encoding, renormalize, extra_config, extra_post_config, **kwargs)
create_files()
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.lm.reverse_arpa

class i6_core.lm.reverse_arpa.ReverseARPALmJob(*args, **kwargs)

Create a new LM in arpa format by reverting the n-grams of an existing Arpa LM.

Parameters:

lm_path (Path) – Path to the existing arpa file

static add_missing_backoffs(words, ngrams: List[Dict[str, Tuple]])
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.lm.srilm

class i6_core.lm.srilm.ComputeBestMixJob(*args, **kwargs)

Compute the best mixture weights for a combination of count LMs based on the given PPL logs

Parameters:
  • ppl_logs – List of PPL Logs to compute the weights from

  • compute_best_mix_exe – Path to srilm compute_best_mix executable

run()

Calls the srilm script, extracts the different weights from the log, then relinks the log to the output folder

tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lm.srilm.ComputeNgramLmJob(*args, **kwargs)

Generate count based LM with SRILM

Parameters:
  • ngram_order – Maximum n gram order

  • data – Either a text file or a counts file to read from; set data_mode accordingly. The counts file can come from CountNgramsJob.out_counts

  • data_mode – Defines whether input format is text based or count based

  • vocab – Vocabulary file, one word per line

  • extra_ngram_args – Extra arguments for the execution call e.g. [‘-kndiscount’]

  • count_exe – Path to srilm ngram-count exe

  • mem_rqmt – Memory requirements of Job (not hashed)

  • time_rqmt – Time requirements of Job (not hashed)

  • cpu_rqmt – Amount of Cpus required for Job (not hashed)

  • fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)

Example options for ngram_args: -kndiscount -interpolate -debug <int> -addsmooth <int>

class DataMode(value)

An enumeration.

COUNT = 2
TEXT = 1
compress()

executes the previously created compression script and relinks the lm from work folder to output folder

create_files()

creates bash script for lm creation and compression that will be executed in the run Task

classmethod hash(kwargs)

delete the queue requirements from the hashing

run()

executes the previously created lm script and relinks the vocabulary from work folder to output folder

tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]
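
A usage sketch for a Kneser-Ney smoothed count LM; all paths are placeholders, and the output attribute of the job is not named here because it differs between i6_core versions:

from sisyphus import tk
from i6_core.lm.srilm import ComputeNgramLmJob

lm_job = ComputeNgramLmJob(
    ngram_order=4,
    data=tk.Path("/path/to/training_text.gz"),           # placeholder text corpus
    data_mode=ComputeNgramLmJob.DataMode.TEXT,
    vocab=tk.Path("/path/to/vocab.txt"),                  # one word per line
    extra_ngram_args=["-kndiscount", "-interpolate"],
    count_exe=tk.Path("/path/to/srilm/bin/ngram-count"),  # placeholder SRILM binary
)
# the resulting arpa LM is provided as a job output (attribute name depends on the i6_core version)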

class i6_core.lm.srilm.ComputeNgramLmPerplexityJob(*args, **kwargs)

Calculate the Perplexity of a Ngram LM via SRILM

Parameters:
  • ngram_order – Maximum n gram order

  • lm – LM to evaluate

  • eval_data – Data to calculate PPL on

  • vocab – Vocabulary file

  • set_unknown_flag – sets unknown lemma

  • extra_ppl_args – Extra arguments for the execution call e.g. ‘-debug 2’

  • ngram_exe – Path to srilm ngram exe

  • mem_rqmt – Memory requirements of Job (not hashed)

  • time_rqmt – Time requirements of Job (not hashed)

  • cpu_rqmt – Amount of Cpus required for Job (not hashed)

  • fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)

create_files()

creates bash script that will be executed in the run Task

get_ppl()

extracts various outputs from the ppl.log file

classmethod hash(kwargs)

delete the queue requirements from the hashing

run()

executes the previously created script and relinks the log file from work folder to output folder

tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lm.srilm.CountNgramsJob(*args, **kwargs)

Count ngrams with SRILM

Parameters:
  • ngram_order – Maximum n gram order

  • data – Input data to be read as textfile

  • extra_count_args – Extra arguments for the execution call e.g. [‘-unk’]

  • count_exe – Path to srilm ngram-count executable

  • mem_rqmt – Memory requirements of Job (not hashed)

  • time_rqmt – Time requirements of Job (not hashed)

  • cpu_rqmt – Amount of Cpus required for Job (not hashed)

  • fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)

Example options/parameters for count_args: -unk

create_files()

creates bash script that will be executed in the run Task

classmethod hash(kwargs)

delete the queue requirements from the hashing

run()

executes the previously created bash script and relinks outputs from work folder to output folder

tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lm.srilm.InterpolateNgramLmJob(*args, **kwargs)

Uses SRILM to interpolate different LMs with previously calculated weights

Parameters:
  • ngram_lms – List of language models to interpolate, format: ARPA, compressed ARPA

  • weights – Weights of different language models, has to be same order as ngram_lms

  • ngram_order – Maximum n gram order

  • extra_interpolation_args – Additional arguments for interpolation

  • ngram_exe – Path to srilm ngram executable

  • mem_rqmt – Memory requirements of Job (not hashed)

  • time_rqmt – Time requirements of Job (not hashed)

  • cpu_rqmt – Amount of Cpus required for Job (not hashed)

  • fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)

classmethod hash(parsed_args)

delete the queue requirements from the hashing

run()

delete the executable from the hashing

tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]
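
A sketch of combining ComputeBestMixJob and InterpolateNgramLmJob: estimate interpolation weights from perplexity logs, then interpolate. Paths are placeholders and the out_weights output attribute name is an assumption:

from sisyphus import tk
from i6_core.lm.srilm import ComputeBestMixJob, InterpolateNgramLmJob

# perplexity logs of the individual LMs on the same dev text (placeholders)
ppl_log_a = tk.Path("/path/to/lm_a.ppl.log")
ppl_log_b = tk.Path("/path/to/lm_b.ppl.log")

best_mix_job = ComputeBestMixJob(
    ppl_logs=[ppl_log_a, ppl_log_b],
    compute_best_mix_exe=tk.Path("/path/to/srilm/bin/compute-best-mix"),  # placeholder
)

interpolate_job = InterpolateNgramLmJob(
    ngram_lms=[tk.Path("/path/to/lm_a.gz"), tk.Path("/path/to/lm_b.gz")],  # same order as ppl_logs
    weights=best_mix_job.out_weights,               # output attribute name assumed
    ngram_order=4,
    ngram_exe=tk.Path("/path/to/srilm/bin/ngram"),  # placeholder
)
# the interpolated arpa LM is provided as a job output (attribute name depends on the i6_core version)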

class i6_core.lm.srilm.PruneLMWithHelperLMJob(*args, **kwargs)

Job that prunes the given LM with the help of a helper LM

Parameters:
  • ngram_order – Maximum n gram order

  • lm – LM to be pruned

  • prune_thresh – Pruning threshold

  • helper_lm – helper/’Katz’ LM to prune the other LM with

  • ngram_exe – Path to srilm ngram-count executable

  • mem_rqmt – Memory requirements of Job (not hashed)

  • time_rqmt – Time requirements of Job (not hashed)

  • cpu_rqmt – Amount of Cpus required for Job (not hashed)

  • fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)

create_files()

creates bash script that will be executed in the run Task

classmethod hash(kwargs)

delete the queue requirements from the hashing

run()

executes the previously created script and relinks the lm from work folder to output folder

tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.lm.vocabulary

class i6_core.lm.vocabulary.LmIndexVocabulary(vocab: sisyphus.job_path.Path, vocab_size: sisyphus.job_path.Variable, unknown_token: Union[sisyphus.job_path.Variable, str])
unknown_token: Union[Variable, str]
vocab: Path
vocab_size: Variable
class i6_core.lm.vocabulary.LmIndexVocabularyFromLexiconJob(*args, **kwargs)

Computes a <word>: <index> vocabulary file from a bliss lexicon for Word-Level LM training

Sentence begin/end will point to index 0, unknown to index 1. Both are taken directly from the lexicon via the “special” marking:

  • <lemma special=”sentence-begin”> -> index 0

  • <lemma special=”sentence-end”> -> index 0

  • <lemma special=”unknown”> -> index 1

If <synt> tokens are provided in a lemma, they will be used instead of <orth>

CAUTION: Be aware of: https://github.com/rwth-i6/returnn/issues/1245 when using Returnn’s LmDataset

Parameters:
  • bliss_lexicon – use the lemmas from the lexicon to define the indices

  • count_ordering_text – optional text that can be used to define the index order based on the lemma count

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.lm.vocabulary.VocabularyFromLmJob(*args, **kwargs)

Extract the vocabulary from an existing LM. Currently supports only arpa files for input.

Parameters:

lm_file (Path) – path to the lm arpa file

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.meta.cart_lda

i6_core.meta.mm_sequence

i6_core.meta.system

i6_core.meta.warping_sequence

i6_core.mm.alignment

class i6_core.mm.alignment.AMScoresFromAlignmentLogJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.mm.alignment.AlignmentJob(*args, **kwargs)
Parameters:
  • crp (rasr.crp.CommonRasrParameters) –

  • feature_flow

  • feature_scorer (rasr.FeatureScorer) –

  • alignment_options (dict[str]) –

  • word_boundaries (bool) –

  • use_gpu (bool) –

  • rtf (float) –

  • extra_config

  • extra_post_config

cleanup_before_run(cmd, retry, task_id, *args)
classmethod create_config(crp, feature_flow, feature_scorer, alignment_options, word_boundaries, extra_config, extra_post_config, **kwargs)
Parameters:
  • crp (rasr.crp.CommonRasrParameters) –

  • feature_flow

  • feature_scorer (rasr.FeatureScorer) –

  • alignment_options (dict[str]) –

  • word_boundaries (bool) –

  • extra_config

  • extra_post_config

Returns:

config, post_config

Return type:

(rasr.RasrConfig, rasr.RasrConfig)

create_files()
classmethod create_flow(feature_flow, **kwargs)
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run(task_id)
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.mm.alignment.DumpAlignmentJob(*args, **kwargs)
cleanup_before_run(cmd, retry, task_id, *args)
classmethod create_config(crp, extra_config, extra_post_config, **kwargs)
create_files()
classmethod create_flow(feature_flow, original_alignment, **kwargs)
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run(task_id)
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.mm.confidence_based_alignment

class i6_core.mm.confidence_based_alignment.ConfidenceBasedAlignmentJob(*args, **kwargs)
cleanup_before_run(cmd, retry, task_id, *args)
classmethod create_config(crp, feature_flow, feature_scorer, lattice_cache, global_scale, confidence_threshold, weight_scale, ref_alignment_path, extra_config, extra_post_config, **kwargs)
create_files()
classmethod create_flow(feature_flow, lattice_cache, global_scale, confidence_threshold, weight_scale, ref_alignment_path, **kwargs)
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run(task_id)
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.mm.flow

i6_core.mm.flow.alignment_flow(feature_net, alignment_cache_path=None)
i6_core.mm.flow.cached_alignment_flow(feature_net, alignment_cache_path)
i6_core.mm.flow.confidence_based_alignment_flow(feature_net, lattice_cache_path, alignment_cache_path=None, global_scale=1.0, confidence_threshold=0.75, weight_scale=1.0, ref_alignment_path=None)
i6_core.mm.flow.dump_alignment_flow(feature_net, original_alignment, new_alignment)
i6_core.mm.flow.linear_segmentation_flow(feature_energy_net, alignment_cache=None)

i6_core.mm.mixtures

i6_core.mm.tdp

i6_core.rasr.command

class i6_core.rasr.command.RasrCommand

Mixin for Job.

NO_RETRY_AFTER_TIME = 600.0
RETRY_WAIT_TIME = 5.0
cleanup_before_run(cmd, retry, task_id, *args)
classmethod default_exe(exe_name)

Extract executable path from the global sisyphus settings

Parameters:

exe_name (str) –

Return type:

str

classmethod get_rasr_exe(exe_name, rasr_root, rasr_arch)
Parameters:
  • exe_name (str) –

  • rasr_root (str) –

  • rasr_arch (str) –

Returns:

path to a rasr binary with the default path pattern inside the repository

Return type:

str

log_file_output_path(name, crp, parallel)
Parameters:
Return type:

Path

run_cmd(cmd: str, args: Optional[List[str]] = None, retries: int = 2, cwd: Optional[str] = None)
Parameters:
  • cmd

  • args

  • retries

  • cwd – execute cmd in this dir

run_script(task_id: int, log_file: Union[str, sisyphus.toolkit.RelPath], cmd: str = './run.sh', args: Optional[List] = None, retries: int = 2, use_tmp_dir: bool = False, copy_tmp_ls: Optional[List] = None)
classmethod select_exe(specific_exe, default_exe_name)
Parameters:
  • specific_exe (str|None) –

  • default_exe_name (str) –

Returns:

path to exe

Return type:

str

static write_config(config, post_config, filename)
Parameters:
  • config (rasr.RasrConfig) –

  • post_config (rasr.RasrConfig) –

  • filename (str) –

static write_run_script(exe, config, filename='run.sh', extra_code='', extra_args='')
Parameters:
  • exe (str) –

  • config (str) –

  • filename (str) –

  • extra_code (str) –

  • extra_args (str) –

i6_core.rasr.config

class i6_core.rasr.config.ConfigBuilder(defaults)
class i6_core.rasr.config.RasrConfig(prolog='', prolog_hash='', epilog='', epilog_hash='')

Used to store a Rasr configuration

Parameters:
  • prolog (string) – A string that should be pasted as code at the beginning of the config file

  • epilog (string) – A string that should be pasted as code at the end of the config file

  • prolog_hash (string) – sets a specific hash for the prolog

  • epilog_hash (string) – sets a specific hash for the epilog

html()
class i6_core.rasr.config.StringWrapper(string, hidden=None)

Deprecated, please use e.g. DelayedFormat directly from Sisyphus.

Example for wrapping commands:

command = DelayedFormat("{} -a -l en -no-escape", tokenizer_binary)

Example for wrapping/combining paths:

pymod_config = DelayedFormat("epoch:{},action:forward,configfile:{}", model.epoch, model.returnn_config_file)

Example for wrapping even function calls:

def cut_ending(path):
    return path[: -len(".meta")]

def foo():
    […]
    config.loader.saved_model_file = DelayedFunction(returnn_model.model, cut_ending)

Parameters:
  • string (str) – some string based on the hashing object

  • hidden (Any) – hashing object

class i6_core.rasr.config.WriteRasrConfigJob(*args, **kwargs)

Write a RasrConfig object into a .config file

Parameters:
  • config (RasrConfig) – RASR config part that is hashed

  • post_config (RasrConfig) – RASR config part that is not hashed

classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.rasr.config.build_config_from_mapping(crp, mapping, include_log_config=True, parallelize=False)
Parameters:
Returns:

config, post_config

Return type:

(RasrConfig, RasrConfig)

i6_core.rasr.crp

class i6_core.rasr.crp.CommonRasrParameters(base=None)

This class holds often used parameters for Rasr.

Parameters:

base (CommonRasrParameters|None) –

html()
set_executables(rasr_binary_path, rasr_arch='linux-x86_64-standard')

Set all executables to a specific binary folder path

Parameters:
  • rasr_binary_path (tk.Path) – path to the rasr binary folder

  • rasr_arch (str) – RASR compile architecture suffix

Returns:

i6_core.rasr.crp.crp_add_default_output(crp, compress=False, append=False, unbuffered=False, compress_after_run=True)
Parameters:
  • crp (CommonRasrParameters) –

  • compress (bool) –

  • append (bool) –

  • unbuffered (bool) –

  • compress_after_run (bool) –

i6_core.rasr.crp.crp_set_corpus(crp, corpus)
Parameters:
  • crp (CommonRasrParameters) –

  • corpus (meta.CorpusObject) – object with corpus_file, audio_dir, audio_format, duration
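
A sketch of setting up a CommonRasrParameters object with the helpers above; the import locations (package-level rasr/meta access), paths and field values are assumptions for illustration:

from sisyphus import tk
import i6_core.rasr as rasr
import i6_core.meta as meta

crp = rasr.CommonRasrParameters()
crp.set_executables(rasr_binary_path=tk.Path("/path/to/rasr/arch/linux-x86_64-standard"))  # placeholder
rasr.crp_add_default_output(crp)

corpus_object = meta.CorpusObject()  # holds corpus_file, audio_dir, audio_format, duration
corpus_object.corpus_file = tk.Path("/path/to/corpus.xml.gz")  # placeholder
corpus_object.audio_format = "wav"
corpus_object.duration = 100.0       # placeholder value
rasr.crp_set_corpus(crp, corpus_object)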

i6_core.rasr.feature_scorer

class i6_core.rasr.feature_scorer.DiagonalMaximumScorer(*args, **kwargs)
class i6_core.rasr.feature_scorer.FeatureScorer
apply_config(path, config, post_config)
html()
class i6_core.rasr.feature_scorer.GMMFeatureScorer(mixtures, scale=1.0)
class i6_core.rasr.feature_scorer.InvAlignmentPassThroughFeatureScorer(prior_mixtures, max_segment_length, mapping, priori_scale=0.0)
class i6_core.rasr.feature_scorer.OnnxFeatureScorer(*, mixtures: Path, model: Path, io_map: Dict[str, str], label_log_posterior_scale: float = 1.0, label_prior_scale: float, label_log_prior_file: Optional[Path] = None, apply_log_on_output: bool = False, negate_output: bool = True, intra_op_threads: int = 1, inter_op_threads: int = 1, **kwargs)
Parameters:
  • mixtures – path to a *.mix file e.g. output of either EstimateMixturesJob or CreateDummyMixturesJob

  • model – path of a model e.g. output of ExportPyTorchModelToOnnxJob

  • io_map – mapping between internal rasr identifiers and the model related input/output. Default key values are “features” and “output”, and optionally “features-size”, e.g. io_map = {“features”: “data”, “output”: “classes”}

  • label_log_posterior_scale – scale for the log probability of a label, e.g. 1.0 is recommended

  • label_prior_scale – scale for the prior log probability of a label, reasonable values are e.g. in the [0.1, 0.7] interval

  • label_log_prior_file – xml file containing log prior probabilities e.g. estimated from the model via povey method

  • apply_log_on_output – whether to apply the log-function on the output, useful if the model outputs softmax instead of log-softmax

  • negate_output – whether to negate the output (because the model outputs log softmax and not negative log softmax)

  • intra_op_threads – Onnxruntime session’s number of parallel threads within each operator

  • inter_op_threads – Onnxruntime session’s number of parallel threads between operators used only for parallel execution mode
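
Constructing the scorer could look like this; all paths are placeholders and the prior scale is just an illustrative value from the suggested interval:

from sisyphus import tk
from i6_core.rasr.feature_scorer import OnnxFeatureScorer

scorer = OnnxFeatureScorer(
    mixtures=tk.Path("/path/to/dummy.mix"),       # e.g. CreateDummyMixturesJob output, placeholder
    model=tk.Path("/path/to/model.onnx"),         # e.g. ExportPyTorchModelToOnnxJob output, placeholder
    io_map={"features": "data", "output": "classes"},
    label_log_posterior_scale=1.0,
    label_prior_scale=0.3,                        # illustrative value
    label_log_prior_file=tk.Path("/path/to/log_prior.xml"),  # placeholder
)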

class i6_core.rasr.feature_scorer.PrecomputedHybridFeatureScorer(prior_mixtures, scale=1.0, priori_scale=0.0, prior_file=None)
class i6_core.rasr.feature_scorer.PreselectionBatchIntScorer(*args, **kwargs)
class i6_core.rasr.feature_scorer.ReturnnScorer(feature_dimension, output_dimension, prior_mixtures, model, mixture_scale=1.0, prior_scale=1.0, prior_file=None, returnn_root=None)
class i6_core.rasr.feature_scorer.SimdDiagonalMaximumScorer(*args, **kwargs)

i6_core.rasr.flow

class i6_core.rasr.flow.FlagDependentFlowAttribute(flag, alternatives)
get(net)
class i6_core.rasr.flow.FlowNetwork(name='network')
Parameters:

name (str) –

add_flags(flags)
add_hidden_input(input)

in case a Path has to be converted to a string that is then added to the network

add_input(name)
add_net(net)
add_node(filter, name, attr=None, **kwargs)
add_output(name)
add_param(name)
apply_config(path, config, post_config)
contains_filter(filter_name)
default_flags = {}
get_input_ports()
get_node_names_by_filter(filter_name)
Parameters:

filter_name (str) –

Returns:

list of from_name

Return type:

list[str]

get_output_ports()
interconnect(a, node_mapping_a, b, node_mapping_b, mapping=None)

assuming a and b are FlowNetworks that have already been added to this net, the outputs of a are linked to the inputs of b, optionally a mapping between the ports can be specified

interconnect_inputs(net, node_mapping, mapping=None)

assuming net has been added to self, link all of self’s inputs to net’s inputs, optionally a mapping between the ports can be specified

interconnect_outputs(net, node_mapping, mapping=None)

assuming net has been added to self, link all of net’s outputs to self’s outputs, optionally a mapping between the ports can be specified

Parameters:
  • from_name (str) –

  • to_name (str) –

remove_node(name)
subnet_from_node(node_name)

Creates a new net where only nodes that follow the given node are retained. Nodes before the specified node are not included. Links between one retained node and one not-retained node are returned as well. This function is useful if a part of a net should be duplicated without copying the other part.

unique_name(name)
Parameters:

name (str) –

Return type:

str

write_to_file(file)
class i6_core.rasr.flow.NamedFlowAttribute(name, value)
get(net)
class i6_core.rasr.flow.NodeMapping(mapping)
Parameters:

mapping (dict) –

class i6_core.rasr.flow.PathWithPrefixFlowAttribute(prefix, path)
get(net)
class i6_core.rasr.flow.WriteFlowNetworkJob(*args, **kwargs)

Writes a flow network to a file

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.rasr.util

class i6_core.rasr.util.ClusterMapToSegmentListJob(*args, **kwargs)

Creates segment files in relation to a speaker cluster map

WARNING: This job has broken (non-portable) hashes and is not really useful anyway, please use it only for existing pipelines.

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.rasr.util.MapSegmentsWithBundlesJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.rasr.util.RemapSegmentsJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.rasr.util.RemapSegmentsWithBundlesJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.recognition.cn_decoding

class i6_core.recognition.cn_decoding.CNDecodingJob(*args, **kwargs)
cleanup_before_run(cmd, retry, task_id, *args)
classmethod create_config(crp, lattice_path, lm_scale, pron_scale, write_cn, extra_config, extra_post_config)
create_files()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

merge()
run(task_id)
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.recognition.conversion

class i6_core.recognition.conversion.LatticeToCtmJob(*args, **kwargs)
cleanup_before_run(cmd, retry, task_id, *args)
classmethod create_config(crp, lattice_cache, parallelize, encoding, fill_empty_segments, best_path_algo, extra_config, extra_post_config, **kwargs)
create_files()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

merge()
run(task_id)
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.recognition.optimize_parameters

class i6_core.recognition.optimize_parameters.OptimizeAMandLMScaleJob(*args, **kwargs)
cleanup_before_run(cmd, retry, *args)
classmethod create_config(crp, lattice_cache, extra_config, extra_post_config, **kwargs)
create_files()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.recognition.prune

class i6_core.recognition.prune.LatticePruningJob(*args, **kwargs)
cleanup_before_run(cmd, retry, task_id, *args)
classmethod create_config(crp, lattice_path, pruning_threshold, phone_coverage, nonword_phones, max_arcs_per_second, max_arcs_per_segment, output_format, pronunciation_scale, extra_config, extra_post_config)
create_files()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run(task_id)
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.recognition.scoring

class i6_core.recognition.scoring.AnalogJob(*args, **kwargs)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.recognition.scoring.Hub5ScoreJob(*args, **kwargs)
Parameters:
  • ref – reference stm text file

  • glm – text file containing mapping rules for scoring

  • hyp – hypothesis ctm text file

  • sctk_binary_path – set an explicit binary path.

calc_wer()
run(move_files=True)
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.recognition.scoring.KaldiScorerJob(*args, **kwargs)

Applies the Kaldi compute-wer binary. Requires gs.KALDI_PATH to be the path to the Kaldi bin folder.

Parameters:
  • ref – Path to corpus file. This job will generate reference from it.

  • hyp – Path to CTM file. It will be converted to Kaldi format in this Job.

  • map – Dictionary with words to be replaced in hyp. Example: {‘[NOISE]’ : ‘’}

  • regex – String with groups used for regex the segment names. WER will be calculated for each group individually. Example: ‘.*(S..)(P..).*’

calc_wer()
run(report_path=None, ref_path=None, hyp_path=None)
run_regex()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.recognition.scoring.QuaeroScorerJob(*args, **kwargs)
calc_wer()
run(move_files=True)
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.recognition.scoring.ScliteJob(*args, **kwargs)

Run the Sclite scorer from the SCTK toolkit

Outputs:
  • out_report_dir: contains the report files with detailed scoring information

  • out_*: the job also outputs many variables, please look in the init code for a list

Parameters:
  • ref – reference stm text file

  • hyp – hypothesis ctm text file

  • cer – compute character error rate

  • sort_files – sort ctm and stm before scoring

  • additional_args – additional command line arguments passed to the Sclite binary call

  • sctk_binary_path – set an explicit binary path.

  • precision_ndigit – number of digits after decimal point for the precision of the percentages in the output variables. If None, no rounding is done. In sclite, the precision was always one digit after the decimal point (https://github.com/usnistgov/SCTK/blob/f48376a203ab17f/src/sclite/sc_dtl.c#L343), thus we recalculate the percentages here.

calc_wer()
run(output_to_report_dir=True)
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]
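
A scoring sketch for ScliteJob; the paths are placeholders, and only out_report_dir is named explicitly since the exact set of out_* variables is listed in the init code:

from sisyphus import tk
from i6_core.recognition.scoring import ScliteJob

score_job = ScliteJob(
    ref=tk.Path("/path/to/reference.stm"),          # placeholder stm reference
    hyp=tk.Path("/path/to/hypothesis.ctm"),         # e.g. an output of LatticeToCtmJob, placeholder
    sctk_binary_path=tk.Path("/path/to/sctk/bin"),  # placeholder
)
tk.register_output("recognition/sclite_report", score_job.out_report_dir)
# the WER itself is available as one of the out_* variables mentioned above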

i6_core.report.report

class i6_core.report.report.GenerateReportStringJob(*args, **kwargs)

Job to generate and output a report string

Parameters:
  • report_values – Can be either directly callable or a dict which then is handled by report_template

  • report_template – Function to handle report_values of type _Report_Type

  • compress – Whether to zip the report

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.report.report.MailJob(*args, **kwargs)

Job that sends a mail upon completion of an output

Parameters:
  • result – graph output that triggers sending the mail

  • subject – Subject of the mail

  • mail_address – Mail address of recipient (default: user)

  • send_contents – send the contents of result in body of the mail

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.returnn.compile

class i6_core.returnn.compile.CompileNativeOpJob(*args, **kwargs)

Compile a RETURNN native op into a shared object file.

Parameters:
  • native_op (str) – Name of the native op to compile (e.g. NativeLstm2)

  • returnn_python_exe (Optional[Path]) – file path to the executable for running returnn (python binary or .sh)

  • returnn_root (Optional[Path]) – file path to the RETURNN repository root folder

  • search_numpy_blas (bool) – search for blas lib in numpy’s .libs folder

  • blas_lib (Path|str) – explicit path to the blas library to use

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.returnn.compile.CompileTFGraphJob(*args, **kwargs)

This Job is a wrapper around the RETURNN tool compile_tf_graph.py

Parameters:
  • returnn_config (ReturnnConfig|Path|str) – Path to a RETURNN config file

  • train (int) –

  • eval (int) –

  • search (int) –

  • epoch (int|tk.Variable|None) – compile a specific epoch for networks that might change with every epoch

  • log_verbosity (int) – RETURNN log verbosity from 1 (least verbose) to 5 (most verbose)

  • device (str|None) – optimize graph for cpu or gpu. If None, defaults to cpu for current RETURNN. For any RETURNN version before cd4bc382, the behavior will depend on the device entry in the returnn_config, or on the availability of a GPU on the execution host if not defined at all.

  • summaries_tensor_name

  • output_format (str) – graph output format, one of [“pb”, “pbtxt”, “meta”, “metatxt”]

  • returnn_python_exe (Optional[Path]) – file path to the executable for running returnn (python binary or .sh)

  • returnn_root (Optional[Path]) – file path to the RETURNN repository root folder

  • rec_step_by_step (Optional[str]) – name of rec layer for step-by-step graph

  • rec_json_info (bool) – whether to enable rec json info for step-by-step graph compilation

classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.returnn.compile.TorchOnnxExportJob(*args, **kwargs)

Export an ONNX model using the appropriate RETURNN tool script.

Currently only supports PyTorch via tools/torch_export_to_onnx.py

Parameters:
  • returnn_config – RETURNN config object

  • checkpoint – Path to the checkpoint for export

  • device – target device for graph creation

  • returnn_python_exe – file path to the executable for running returnn (python binary or .sh)

  • returnn_root – file path to the RETURNN repository root folder

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.returnn.config

class i6_core.returnn.config.CodeWrapper(code)

Can be used to insert direct “code” (not as quoted string) into the config dict

class i6_core.returnn.config.ReturnnConfig(config, post_config=None, staged_network_dict=None, *, python_prolog=None, python_prolog_hash=None, python_epilog='', python_epilog_hash=None, hash_full_python_code=False, sort_config=True, pprint_kwargs=None, black_formatting=True)

An object that manages a RETURNN config.

It can be used to serialize python functions and class definitions directly from Sisyphus code and paste them into the RETURNN config file.

Parameters:
  • config (dict) – dictionary of the RETURNN config variables that are hashed

  • post_config (dict) – dictionary of the RETURNN config variables that are not hashed

  • staged_network_dict (None|dict[int, str|dict[str, Any]]) – dictionary of network dictionaries or any string that defines a network with network = … (e.g. the return variable of get_ext_net_dict_py_code_str() from returnn_common), indexed by the desired starting epoch of the network stage. By enabling this, an additional “networks” folder will be created next to the config location.

  • python_prolog (None|str|Callable|Class|tuple|list|dict) – str or structure containing str/callables/classes that should be pasted as code at the beginning of the config file

  • python_prolog_hash (None|Any) – set a specific hash (str) or any type of hashable objects to overwrite the default hashing for python_prolog

  • python_epilog (None|str|Callable|Class|tuple|list|dict) – str or structure containing str/callables/classes that should be pasted as code at the end of the config file

  • python_epilog_hash (None|Any) – set a specific hash (str) or any type of hashable objects to overwrite the default hashing for python_epilog

  • hash_full_python_code (bool) – By default, function bodies are not hashed. If set to True, the full content of python pro-/epilog is parsed and hashed.

  • sort_config (bool) – If set to True, the dictionary part of the config is sorted by key

  • pprint_kwargs (dict|None) – kwargs for pprint, e.g. {“sort_dicts”: False} to print dicts in given order for python >= 3.8

  • black_formatting (bool) – if true, the written config will be formatted with black

GET_NETWORK_CODE = 'import os\nimport sys\nsys.path.insert(0, os.path.dirname(__file__))\n\ndef get_network(epoch, **kwargs):\n  from networks import networks_dict\n  for epoch_ in sorted(networks_dict.keys(), reverse=True):\n    if epoch_ <= epoch:\n      return networks_dict[epoch_]\n  assert False, "Error, no networks found"\n\n'
PYTHON_CODE = '#!rnn.py\n\n${SUPPORT_CODE}\n\n${PROLOG}\n\n${REGULAR_CONFIG}\n\nlocals().update(**config)\n\n${EPILOG}\n'
check_consistency()

Check that there is no config key overwritten by post_config. Also check for parameters that should never be hashed.

get(key, default=None)
update(other)
updates a ReturnnConfig with another ReturnnConfig:
  • config, post_config, and pprint_kwargs use dict.update

  • prolog, epilog, and hashes are concatenated

  • staged_network_dict, sort_config, and black_formatting are overwritten

Parameters:

other (ReturnnConfig) –

write(path)
Parameters:

path (str) –
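
A small construction sketch; the config entries are ordinary RETURNN settings chosen purely for illustration:

from i6_core.returnn.config import ReturnnConfig

returnn_config = ReturnnConfig(
    config={
        "task": "train",
        "network": {"output": {"class": "softmax", "loss": "ce"}},  # illustrative network
        "learning_rate": 1e-3,
        "batch_size": 10000,
    },
    post_config={
        "log_verbosity": 5,  # not hashed, changing it does not change the job hash
        "device": "gpu",
    },
    pprint_kwargs={"sort_dicts": False},
)
returnn_config.write("returnn.config")  # serialize to a file, see write(path) above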

class i6_core.returnn.config.WriteReturnnConfigJob(*args, **kwargs)

Writes a ReturnnConfig into a .config file

Parameters:

returnn_config (ReturnnConfig) –

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.returnn.dataset

i6_core.returnn.extract_prior

i6_core.returnn.flow

i6_core.returnn.forward

i6_core.returnn.hdf

i6_core.returnn.oggzip

i6_core.returnn.rasr_training

i6_core.returnn.training

class i6_core.returnn.training.AverageTFCheckpointsJob(*args, **kwargs)

Compute the average of multiple specified Tensorflow checkpoints using the tf_avg_checkpoints script from Returnn

Parameters:
  • model_dir – model dir from ReturnnTrainingJob

  • epochs – manually specified epochs or out_epoch from GetBestEpochJob

  • returnn_python_exe – file path to the executable for running returnn (python binary or .sh)

  • returnn_root – file path to the RETURNN repository root folder

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]
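
A sketch of averaging checkpoints around the best epoch; train_job is assumed to be a ReturnnTrainingJob (documented below), the RETURNN paths are placeholders, and out_epoch is the GetBestEpochJob output referenced in the parameter list above:

from sisyphus import tk
from i6_core.returnn.training import AverageTFCheckpointsJob, GetBestEpochJob

returnn_exe = tk.Path("/path/to/returnn_launcher.sh")  # placeholder
returnn_root = tk.Path("/path/to/returnn")             # placeholder

best_epoch_job = GetBestEpochJob(
    model_dir=train_job.out_model_dir,            # train_job: assumed ReturnnTrainingJob
    learning_rates=train_job.out_learning_rates,
    key="dev_score_output/output_prob",
    index=0,                                      # 0 selects the lowest value, i.e. the best epoch
)

average_job = AverageTFCheckpointsJob(
    model_dir=train_job.out_model_dir,
    epochs=[best_epoch_job.out_epoch],
    returnn_python_exe=returnn_exe,
    returnn_root=returnn_root,
)
# the averaged checkpoint is provided as a job output (attribute name depends on the i6_core version)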

class i6_core.returnn.training.Checkpoint(index_path)

Checkpoint object which holds the (Tensorflow) index file path as tk.Path, and will return the checkpoint path as common prefix of the .index/.meta/.data[…]

A checkpoint object should be directly assigned to a RasrConfig entry (do not call .ckpt_path) so that the hash will resolve correctly

Parameters:

index_path (Path) –

property ckpt_path
exists()
class i6_core.returnn.training.GetBestEpochJob(*args, **kwargs)

Provided a RETURNN model directory and a score key, finds the best epoch. The sorting is lower=better, so to access the model with the highest values use negative index values (e.g. -1 for the model with the highest score, error or “loss”)

Parameters:
  • model_dir – model_dir output from a RETURNNTrainingJob

  • learning_rates – learning_rates output from a RETURNNTrainingJob

  • key – a key from the learning rate file that is used to sort the models, e.g. “dev_score_output/output_prob”

  • index – index of the sorted list to access, 0 for the lowest, -1 for the highest score/error/loss

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.returnn.training.GetBestPtCheckpointJob(*args, **kwargs)

Analogous to GetBestTFCheckpointJob, just for torch checkpoints.

Parameters:
  • model_dir (Path) – model_dir output from a ReturnnTrainingJob

  • learning_rates (Path) – learning_rates output from a ReturnnTrainingJob

  • key (str) – a key from the learning rate file that is used to sort the models e.g. “dev_score_output/output_prob”

  • index (int) – index of the sorted list to access, 0 for the lowest, -1 for the highest score

run()
class i6_core.returnn.training.GetBestTFCheckpointJob(*args, **kwargs)

Returns the best checkpoint given a training model dir and a learning-rates file. The best checkpoint will be HARD-linked if possible, so that no space is wasted but the model is also not deleted in case the training folder is removed.

Parameters:
  • model_dir (Path) – model_dir output from a RETURNNTrainingJob

  • learning_rates (Path) – learning_rates output from a RETURNNTrainingJob

  • key (str) – a key from the learning rate file that is used to sort the models e.g. “dev_score_output/output_prob”

  • index (int) – index of the sorted list to access, 0 for the lowest, -1 for the highest score

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.returnn.training.PtCheckpoint(path: Path)

Checkpoint object pointing to a PyTorch checkpoint .pt file

Parameters:

path – .pt file

exists()
class i6_core.returnn.training.ReturnnModel(returnn_config_file, model, epoch)

Defines a RETURNN model as config, checkpoint meta file and epoch

This is deprecated, use Checkpoint instead.

Parameters:
  • returnn_config_file (Path) – Path to a returnn config file

  • model (Path) – Path to a RETURNN checkpoint (only the .meta for Tensorflow)

  • epoch (int) –

class i6_core.returnn.training.ReturnnTrainingFromFileJob(*args, **kwargs)

This Job allows directly executing RETURNN config files. The config files have to contain the lines ext_model = config.value(“ext_model”, None) and model = ext_model to correctly set the model path.

If the learning rate file should be available, add ext_learning_rate_file = config.value(“ext_learning_rate_file”, None) and learning_rate_file = ext_learning_rate_file

Other externally controllable parameters may also be defined in the same way and can be set by providing the parameter value in the parameter_dict. The “ext_” prefix is a naming convention only, but it should be used for all external parameters to clearly mark them instead of simply overwriting any normal parameter.

Also make sure that task=”train” is set.

Parameters:
  • returnn_config_file (tk.Path|str) – a returnn training config file

  • parameter_dict (dict) – provide external parameters to the rnn.py call

  • time_rqmt (int|str) –

  • mem_rqmt (int|str) –

  • returnn_python_exe (Optional[Path]) – file path to the executable for running returnn (python binary or .sh)

  • returnn_root (Optional[Path]) – file path to the RETURNN repository root folder

create_files()
get_parameter_list()
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

path_available(path)

Returns True if the given path is already available

Parameters:

path – path to check

Returns:

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.returnn.training.ReturnnTrainingJob(*args, **kwargs)

Train a RETURNN model using the rnn.py entry point.

Only returnn_config, returnn_python_exe and returnn_root influence the hash.

The outputs provided are:

  • out_returnn_config_file: the finalized Returnn config which is used for the rnn.py call

  • out_learning_rates: the file containing the learning rates and training scores (e.g. use to select the best checkpoint or generate plots)

  • out_model_dir: the model directory, which can be used in succeeding jobs to select certain models or do combinations

    note that the model dir is DIRECTLY AVAILABLE when the job starts running, so jobs that do not have other conditions need to implement an “update” method to check if the required checkpoints are already existing

  • out_checkpoints: a dictionary containing all created checkpoints. Note that when using the automatic checkpoint cleaning function of RETURNN, not all checkpoints are actually available.

Parameters:
  • returnn_config

  • log_verbosity – RETURNN log verbosity from 1 (least verbose) to 5 (most verbose)

  • device – “cpu” or “gpu”

  • num_epochs – number of epochs to run, will also set num_epochs in the config file. Note that this value is NOT HASHED, so that this number can be increased to continue the training.

  • save_interval – save a checkpoint each n-th epoch

  • keep_epochs – specify which checkpoints are kept, use None for the RETURNN default This will also limit the available output checkpoints to those defined. If you want to specify the keep behavior without this limitation, provide cleanup_old_models/keep in the post-config and use None here.

  • time_rqmt

  • mem_rqmt

  • cpu_rqmt

  • horovod_num_processes – If used without multi_node_slots, then single node, otherwise multi node.

  • multi_node_slots – multi-node multi-GPU training. See Sisyphus rqmt documentation. Currently only with Horovod, and horovod_num_processes should be set as well, usually to the same value. See https://returnn.readthedocs.io/en/latest/advanced/multi_gpu.html.

  • returnn_python_exe – file path to the executable for running returnn (python binary or .sh)

  • returnn_root – file path to the RETURNN repository root folder

check_blacklisted_parameters(returnn_config)

Check for parameters that should not be set in the config directly

Parameters:

returnn_config (ReturnnConfig) –

Returns:

create_files()
classmethod create_returnn_config(returnn_config, log_verbosity, device, num_epochs, save_interval, keep_epochs, horovod_num_processes, **kwargs)
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

info()

Returns information about the currently running job to be displayed on the web interface and the manager view.

Returns:

string to be displayed or None if not available

Return type:

str

path_available(path)

Returns True if the given path is already available

Parameters:

path – path to check

Returns:

plot()
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]
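
A training sketch combining ReturnnTrainingJob with GetBestTFCheckpointJob; the requirement values and RETURNN paths are placeholders, and returnn_config is assumed to be a ReturnnConfig as sketched under i6_core.returnn.config:

from sisyphus import tk
from i6_core.returnn.training import ReturnnTrainingJob, GetBestTFCheckpointJob

returnn_exe = tk.Path("/path/to/returnn_launcher.sh")  # placeholder
returnn_root = tk.Path("/path/to/returnn")             # placeholder

train_job = ReturnnTrainingJob(
    returnn_config=returnn_config,   # assumed ReturnnConfig, see i6_core.returnn.config
    num_epochs=250,                  # not hashed, can be increased to continue the training
    log_verbosity=5,
    device="gpu",
    save_interval=1,
    keep_epochs=None,                # use the RETURNN default keep behavior
    time_rqmt=168,
    mem_rqmt=16,
    cpu_rqmt=4,
    returnn_python_exe=returnn_exe,
    returnn_root=returnn_root,
)
tk.register_output("training/learning_rates", train_job.out_learning_rates)

best_checkpoint_job = GetBestTFCheckpointJob(
    model_dir=train_job.out_model_dir,
    learning_rates=train_job.out_learning_rates,
    key="dev_score_output/output_prob",
    index=0,                         # lowest value of the key, i.e. the best model
)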

i6_core.returnn.vocabulary

i6_core.sat.clustering

class i6_core.sat.clustering.BayesianInformationClusteringJob(*args, **kwargs)

Generate a corpus-key-map based on the Bayesian information criterion. Each concurrent is clustered independently.

classmethod create_config(crp, feature_flow, extra_config, extra_post_config, **kwargs)
create_files()
classmethod create_flow(feature_flow, **kwargs)
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

merge()
run(task_id)
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.sat.flow

i6_core.sat.flow.add_cmllr_transform(feature_net: FlowNetwork, map_file: Path, transform_dir: Path, matrix_name: str = '$input(corpus-key).matrix') FlowNetwork
Parameters:
  • feature_net – flow network for feature extraction, e.g. one from i6_core.features

  • map_file – RASR corpus-key-map file, e.g. out_cluster_map_file from SegmentCorpusBySpeakerJob

  • transform_dir – Directory containing the transformation matrix files, e.g. EstimateCMLLRJob.out_transforms

  • matrix_name – Name pattern for the matrix files in the transform_dir

Returns:

A new flow network with the CMLLR transformation added
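
For example (a sketch: mfcc_feature_flow, segment_by_speaker_job and cmllr_job stand for existing pipeline objects; the output names follow the parameter descriptions above):

from i6_core.sat.flow import add_cmllr_transform

sat_feature_flow = add_cmllr_transform(
    feature_net=mfcc_feature_flow,                         # e.g. a flow from i6_core.features
    map_file=segment_by_speaker_job.out_cluster_map_file,  # from SegmentCorpusBySpeakerJob
    transform_dir=cmllr_job.out_transforms,                # from EstimateCMLLRJob
)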

i6_core.sat.flow.segment_clustering_flow(feature_flow=None, file='cluster.map.$(TASK)', minframes=0, mincluster=2, maxcluster=100000, threshold=0, _lambda=1, minalpha=0.4, maxalpha=0.6, alpha=-1, amalgamation=0, infile=None, **kwargs)
Parameters:
  • feature_flow – Flownetwork of features used for clustering

  • file – Name of the cluster outputfile

  • minframes – minimum number of frames in a segment to consider the segment for clustering

  • mincluster – minimum number of clusters

  • maxcluster – maximum number of clusters

  • threshold – Threshold for BIC which is added to the model-complexity based penalty

  • _lambda – Weight for the model-complexity-based penalty (only lambda=1 corresponds to the definition of BIC; decreasing lambda increases the number of segment clusters)

  • minalpha – Minimum Alpha scaling value used within distance scaling optimization

  • maxalpha – Maximum Alpha scaling value used within distance scaling optimization

  • alpha – Weighting Factor for correlation-based distance (default is automatic alpha estimation using minalpha and maxalpha values)

  • amalgamation – Amalgamation Rule 1=Max Linkage, 0=Concatenation

  • infile – Name of inputfile of clusters

Returns:

(FlowNetwork)

i6_core.sat.training

class i6_core.sat.training.EstimateCMLLRJob(*args, **kwargs)
cleanup_before_run(cmd, retry, task_id, *args)
classmethod create_config(crp, feature_flow, mixtures, alignment, cluster_map, estimation_iter, min_observation_weight, optimization_criterion, extra_config, extra_post_config, **kwargs)
create_files()
classmethod create_flow(feature_flow, alignment, **kwargs)
classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

move_transforms()
run(task_id)
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.summary.wer

class i6_core.summary.wer.KaldiSummaryJob(*args, **kwargs)
Parameters:
  • data (dict) – contains strings at keys data[(col, row)]

  • header (str) – header of the first column

  • col_names ([str]) – list of columns in order of appearance

  • row_names ([str]) – list of rows in order of appearance

  • sort_rows (bool) – if true, the rows will be sorted alphanumerically

  • sort_cols (bool) – if true, the columns will be sorted alphanumerically
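
A sketch of the expected data layout, assuming the constructor takes the parameters listed above as keyword arguments; the WER strings and system names are made up.

    from i6_core.summary.wer import KaldiSummaryJob

    data = {
        ("dev", "baseline"): "9.1",
        ("dev", "+sat"): "8.5",
        ("test", "baseline"): "9.8",
        ("test", "+sat"): "9.2",
    }
    summary = KaldiSummaryJob(
        data=data,
        header="system",
        col_names=["dev", "test"],
        row_names=["baseline", "+sat"],
        sort_rows=False,
        sort_cols=False,
    )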

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

static wer(path)
class i6_core.summary.wer.PrintTableJob(*args, **kwargs)
Parameters:
  • data (dict) – contains strings at keys data[(col, row)]

  • header (str) – header of the first column

  • col_names ([str]) – list of columns in order of appearance

  • row_names ([str]) – list of rows in order of appearance

  • sort_rows (bool) – if true, the rows will be sorted alphanumerically

  • sort_cols (bool) – if true, the columns will be sorted alphanumerically

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.summary.wer.ScliteLurSummaryJob(*args, **kwargs)

Prints a table containing all sclite lur results

Parameters:

data – {name: str, report_dir: str}

dict2table(dicts)

Gets a list of dictionaries and creates a table

Parameters:

dicts – [{name: str, data: {col: float, col: float, …}}, …]

Returns:

parse_lur(file_path)
run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.summary.wer.ScliteSummaryJob(*args, **kwargs)
Parameters:
  • data (dict) – contains strings at keys data[(col, row)]

  • header (str) – header of the first column

  • col_names ([str]) – list of columns in order of appearance

  • row_names ([str]) – list of rows in order of appearance

  • sort_rows (bool) – if true, the rows will be sorted alphanumerically

  • sort_cols (bool) – if true, the columns will be sorted alphanumerically

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

static wer(path)
class i6_core.summary.wer.TableReport(header, precision=2)
add_entry(col, row, var)
update_names()

i6_core.tests.job_tests.corpus.test_convert

i6_core.tests.job_tests.corpus.test_convert.test_corpus_to_stm()
i6_core.tests.job_tests.corpus.test_convert.test_corpus_to_stm_non_speech()
i6_core.tests.job_tests.corpus.test_convert.test_corpus_to_stm_none()
i6_core.tests.job_tests.corpus.test_convert.test_corpus_to_stm_punctuation()
i6_core.tests.job_tests.corpus.test_convert.test_corpus_to_stm_whitespace()

i6_core.tests.job_tests.rasr.test_config

i6_core.tests.job_tests.rasr.test_config.test_write_rasr_config()

This test can be used to test the writing of different variable types into a rasr config and check for correct serialization. Only dummy example for now.

i6_core.tests.job_tests.rasr.test_flow

i6_core.tests.job_tests.rasr.test_flow.test_deterministic_flow_serialization()

Check if the RASR flow network is serialized in a deterministic way by running multiple serializations of slightly different flow networks. Serialization used to be non-deterministic over different python interpreter runs.

i6_core.tests.job_tests.recognition.test_scoring

i6_core.tests.job_tests.recognition.test_scoring.compile_sctk(branch: Optional[str] = None, commit: Optional[str] = None, sctk_git_repository: str = 'https://github.com/usnistgov/SCTK.git') Path
Parameters:
  • branch – specify a specific branch

  • commit – specify a specific commit

  • sctk_git_repository – where to clone SCTK from, usually does not need to be altered

Returns:

SCTK binary folder

i6_core.tests.job_tests.recognition.test_scoring.test_sclite_job()

i6_core.tests.job_tests.returnn.test_convert

i6_core.tests.job_tests.returnn.test_convert.test_corpus_replace_orth_from_reference_corpus()

i6_core.tests.job_tests.returnn.test_vocabulary

i6_core.text.label.sentencepiece.train

class i6_core.text.label.sentencepiece.train.SentencePieceType(value)

An enumeration.

BPE = 'bpe'
CHAR = 'char'
UNIGRAM = 'unigram'
WORD = 'word'
class i6_core.text.label.sentencepiece.train.TrainSentencePieceJob(*args, **kwargs)

Train a sentence-piece model to be used with RETURNN

See also https://returnn.readthedocs.io/en/latest/api/datasets.util.vocabulary.html#returnn.datasets.util.vocabulary.SentencePieces

Parameters:
  • training_text (tk.Path) – raw text or gzipped text

  • vocab_size (int) – target vocabulary size for the created model

  • model_type (SentencePieceType) – which sentence model to use, use “UNIGRAM” for “typical” SPM

  • character_coverage (float) – the official default is 0.9995, but this can cause the least-used characters to be dropped entirely

  • additional_options (dict|None) – additional trainer options, see https://github.com/google/sentencepiece/blob/master/doc/options.md
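
A minimal sketch for training a unigram sentence-piece model; the input path is a hypothetical placeholder and the output attribute name out_model is an assumption.

    from sisyphus import tk
    from i6_core.text.label.sentencepiece.train import (
        SentencePieceType,
        TrainSentencePieceJob,
    )

    training_text = tk.Path("/path/to/corpus.txt.gz")  # hypothetical input
    spm_job = TrainSentencePieceJob(
        training_text=training_text,
        vocab_size=2000,
        model_type=SentencePieceType.UNIGRAM,
        character_coverage=1.0,
    )
    spm_model = spm_job.out_model  # output attribute name assumed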

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.text.label.subword_nmt.apply

class i6_core.text.label.subword_nmt.apply.ApplyBPEModelToLexiconJob(*args, **kwargs)

Apply BPE codes to a Bliss lexicon file

Parameters:
  • bliss_lexicon (Path) –

  • bpe_codes (Path) –

  • bpe_vocab (Path|None) –

  • subword_nmt_repo (Optional[Path]) –

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.text.label.subword_nmt.apply.ApplyBPEToTextJob(*args, **kwargs)

Apply BPE codes on a text file

Parameters:
  • text_file – words text file to convert to bpe

  • bpe_codes – bpe codes file, e.g. ReturnnTrainBpeJob.out_bpe_codes

  • bpe_vocab – if provided, then merge operations that produce OOV are reverted, use e.g. ReturnnTrainBpeJob.out_bpe_dummy_count_vocab

  • subword_nmt_repo – subword nmt repository path. see also CloneGitRepositoryJob

  • gzip_output – use gzip on the output text

  • mini_task – if the Job should run locally, e.g. only a small (<1M lines) text should be processed
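
A minimal sketch applying existing BPE codes to a text file; train_text, subword_nmt_repo and bpe_job (a ReturnnTrainBpeJob instance, see i6_core.text.label.subword_nmt.train below) are placeholders, and the output attribute name out_bpe_text is an assumption.

    from i6_core.text.label.subword_nmt.apply import ApplyBPEToTextJob

    apply_bpe = ApplyBPEToTextJob(
        text_file=train_text,                         # word-level text file (placeholder)
        bpe_codes=bpe_job.out_bpe_codes,              # from ReturnnTrainBpeJob
        bpe_vocab=bpe_job.out_bpe_dummy_count_vocab,  # reverts merges that would produce OOV
        subword_nmt_repo=subword_nmt_repo,            # e.g. a CloneGitRepositoryJob output
        gzip_output=True,
        mini_task=True,
    )
    bpe_text = apply_bpe.out_bpe_text  # output attribute name assumed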

classmethod hash(parsed_args)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.text.label.subword_nmt.train

class i6_core.text.label.subword_nmt.train.ReturnnTrainBpeJob(*args, **kwargs)

Create BPE codes and vocab files compatible with RETURNN BytePairEncoding

This job can be used to produce BPE codes compatible to legacy (non-sisyphus) RETURNN setups.

Outputs:
  • bpe_codes: the codes file to apply BPE to any text

  • bpe_vocab: the index vocab in the form of {“<token>”: <index>, …} that can be used e.g. for RETURNN

    Will contain <s> and </s> pointing to index 0 and the unk_label pointing to index 1

  • bpe_dummy_count_vocab: a text file containing all words, to be used with the ApplyBPEToTextJob

    DOES NOT INCLUDE COUNTS; each count is just set to -1. This is used to avoid invalid merges when converting text to the BPE form.

  • vocab_size: variable containing the number of indices

Parameters:
  • text_file – corpus text file, .gz compressed or uncompressed

  • bpe_size (int) – number of BPE merge operations

  • unk_label (str) – unknown label

  • subword_nmt_repo (Path|None) – subword nmt repository path. see also CloneGitRepositoryJob
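
A minimal training sketch; the input path and subword_nmt_repo are placeholders, and the out_* attribute names are assumed to follow the outputs listed above.

    from sisyphus import tk
    from i6_core.text.label.subword_nmt.train import ReturnnTrainBpeJob

    corpus_text = tk.Path("/path/to/corpus.txt.gz")  # hypothetical input
    bpe_job = ReturnnTrainBpeJob(
        text_file=corpus_text,
        bpe_size=10000,
        unk_label="<unk>",
        subword_nmt_repo=subword_nmt_repo,  # e.g. a CloneGitRepositoryJob output (placeholder)
    )
    bpe_codes = bpe_job.out_bpe_codes    # documented output
    bpe_vocab = bpe_job.out_bpe_vocab    # attribute name assumed
    vocab_size = bpe_job.out_vocab_size  # attribute name assumed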

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.text.label.subword_nmt.train.TrainBPEModelJob(*args, **kwargs)

Create a bpe codes file using the official subword-nmt repo, either installed from pip or https://github.com/rsennrich/subword-nmt

This job is deprecated. To create BPE codes that are compatible with legacy (non-sisyphus) RETURNN setups, e.g. using language models from Kazuki, please use the ReturnnTrainBpeJob.

Otherwise, please consider using the sentencepiece implementation.

Parameters:
  • text_corpus (Path) –

  • symbols (int) –

  • min_frequency (int) –

  • dict_input (bool) –

  • total_symbols (bool) –

  • subword_nmt_repo (Optional[Path]) –

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.text.processing

class i6_core.text.processing.ConcatenateJob(*args, **kwargs)

Concatenate all given input files (gz or raw)

Parameters:
  • text_files (list[Path]) – input text files

  • zip_out (bool) – apply gzip to the output

  • out_name (str) – user specific name

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.text.processing.HeadJob(*args, **kwargs)

Return the head of a text file, either as an absolute number of lines or as a ratio (provide exactly one)

Parameters:
  • text_file (Path) – text file (gz or raw)

  • num_lines (int) – number of lines to extract

  • ratio (float) – ratio of lines to extract

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.text.processing.PipelineJob(*args, **kwargs)

Reads a text file and applies a list of piped shell commands

Parameters:
  • text_files (iterable[Path]|Path) – text file (raw or gz) or list of files to be processed

  • pipeline (list[str|DelayedBase]) – list of shell commands to form the pipeline, can be empty to use the job for concatenation or gzip compression only.

  • zip_output (bool) – apply gzip to the output

  • check_equal_length (bool) – the line count of the input and output should match

  • mini_task (bool) – the pipeline should be run as mini_task
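
A small sketch that lowercases and whitespace-normalizes a text file through a shell pipeline; the input path is a hypothetical placeholder and the output attribute name out is an assumption.

    from sisyphus import tk
    from i6_core.text.processing import PipelineJob

    corpus_text = tk.Path("/path/to/text.gz")  # hypothetical input
    normalize = PipelineJob(
        text_files=corpus_text,
        pipeline=[
            "tr '[:upper:]' '[:lower:]'",  # lowercase
            "sed 's/  */ /g'",             # squeeze repeated spaces
        ],
        zip_output=True,
        check_equal_length=True,
        mini_task=True,
    )
    normalized_text = normalize.out  # output attribute name assumed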

classmethod hash(parsed_args)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.text.processing.SetDifferenceJob(*args, **kwargs)

Return the set difference of two text files, where one line is one element.

This job performs the set difference minuend - subtrahend. Unlike the bash utility comm, the two files do not need to be sorted.

Parameters:
  • minuend (Path) – left-hand side of the set subtraction

  • subtrahend (Path) – right-hand side of the set subtraction

  • gzipped (bool) – whether the output should be compressed in gzip format

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

class i6_core.text.processing.TailJob(*args, **kwargs)

Return the tail of a text file, either as an absolute number of lines or as a ratio (provide exactly one)

Parameters:
  • text_file (Path) – text file (gz or raw)

  • num_lines (int) – number of lines to extract

  • ratio (float) – ratio of lines to extract

run()
class i6_core.text.processing.WriteToTextFileJob(*args, **kwargs)

Write a given content into a text file, one entry per line

Parameters:

content (list|dict|str) – input which will be written into a text file

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.tools.compile

class i6_core.tools.compile.MakeJob(*args, **kwargs)

Executes a sequence of make commands in a given folder

Parameters:
  • folder – folder in which the make commands are executed, e.g. a CloneGitRepositoryJob output

  • make_sequence – list of options that are given to the make calls; defaults to [“all”], i.e. “make all” is executed

  • configure_opts – if given, runs ./configure with these options before make

  • num_processes – number of parallel running make processes

  • output_folder_name – name of the output path folder, if None, the repo is not copied as output

  • link_outputs – provide “output_name”: “local/repo/file_folder” pairs to link (or copy if output_folder_name=None) files or directories as output. This can be used to access single binaries or a binary folder instead of the whole repository.
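
A sketch that clones and builds a repository, loosely following the compile_sctk helper referenced in the test section above; the exact make targets and the output attribute names are assumptions.

    from i6_core.tools.git import CloneGitRepositoryJob
    from i6_core.tools.compile import MakeJob

    sctk_repo = CloneGitRepositoryJob(
        url="https://github.com/usnistgov/SCTK.git",
    ).out_repository  # output attribute name assumed

    sctk_make = MakeJob(
        folder=sctk_repo,
        make_sequence=["config", "all", "check", "install", "doc"],  # assumed SCTK targets
        link_outputs={"bin": "bin/"},  # expose only the binary folder as output
    )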

classmethod hash(kwargs)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks() Iterator[Task]
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.tools.download

class i6_core.tools.download.DownloadJob(*args, **kwargs)

Download an arbitrary file with optional checksum verification

If a checksum is provided, the url will not be hashed

Parameters:
  • url (str) –

  • target_filename (str|None) – explicit output filename, if None tries to detect the filename from the url

  • checksum (str|None) – A sha256 checksum to verify the file
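
A minimal sketch; url and checksum are placeholders and the output attribute name out_file is an assumption. As noted above, providing a checksum means the url is not hashed.

    from i6_core.tools.download import DownloadJob

    download = DownloadJob(
        url="https://example.com/data/corpus.tar.gz",  # placeholder URL
        target_filename="corpus.tar.gz",
        checksum="0123abcd...",                        # placeholder sha256 checksum
    )
    archive = download.out_file  # output attribute name assumed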

classmethod hash(parsed_args)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.tools.git

class i6_core.tools.git.CloneGitRepositoryJob(*args, **kwargs)

Clone a git repository given optional branch name and commit hash

Parameters:
  • url (str) – git repository url

  • branch (str) – git branch name

  • commit (str) – git commit hash

  • checkout_folder_name (str) – name of the output path repository folder

run()
tasks()
Returns:

yields Task’s

Return type:

list[sisyphus.task.Task]

i6_core.util

class i6_core.util.MultiOutputPath(creator, path_template, hidden_paths, cached=False)
class i6_core.util.MultiPath(path_template, hidden_paths, cached=False, path_root=None, hash_overwrite=None)
i6_core.util.add_suffix(string: str, suffix: str) str
i6_core.util.backup_if_exists(file: str)
i6_core.util.cached_path(path: Union[str, Path]) Union[str, bytes]
i6_core.util.check_file_sha256_checksum(filename: str, reference_checksum: str)

Validates the sha256sum for a file against the target checksum

Parameters:
  • filename – a single file to be checked

  • reference_checksum – checksum to verify against

i6_core.util.chunks(l: List, n: int) List[List]
Parameters:
  • l – list which should be split into chunks

  • n – number of chunks

Returns:

yields n chunks

i6_core.util.compute_file_sha256_checksum(filename: str) str

Computes the sha256sum for a file

Parameters:

filename – a single file to be checked

Returns:

checksum

Return type:

str

i6_core.util.create_executable(filename: str, command: List[str])

create an executable .sh file calling a single command

Parameters:
  • filename – executable name ending with .sh

  • command – list representing the command and parameters

i6_core.util.delete_if_exists(file: str)
i6_core.util.delete_if_zero(file: str)
i6_core.util.get_executable_path(path: Optional[Path], gs_member_name: Optional[str], default_exec_path: Optional[Path] = None) Path

Helper function that allows selecting a specific version of software while maintaining compatibility with the different methods that were used in the past to select software versions. It returns a Path object for the first path found, checking the parameters in the order listed below.

Parameters:
  • path – Directly specify the path to be used

  • gs_member_name – get path from sisyphus.global_settings.<gs_member_name>

  • default_exec_path – general fallback if no specific version is given
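
A sketch of the fallback order; the global-settings member name RETURNN_PYTHON_EXE and the default path are illustrative assumptions.

    from sisyphus import tk
    from i6_core.util import get_executable_path

    returnn_python = get_executable_path(
        path=None,                                      # no explicit path given here
        gs_member_name="RETURNN_PYTHON_EXE",            # assumed global-settings member name
        default_exec_path=tk.Path("/usr/bin/python3"),  # general fallback (assumed)
    )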

i6_core.util.get_g2p_path(g2p_path: Path) Path

gets the path to the sequitur g2p script

i6_core.util.get_g2p_python(g2p_python: Path) Path

gets the path to a python binary or script that is used to run g2p

i6_core.util.get_returnn_python_exe(returnn_python_exe: Path) Path

gets the path to a python binary or script that is used to run RETURNN

i6_core.util.get_returnn_root(returnn_root: Path) Path

gets the path to the root folder of RETURNN

i6_core.util.get_subword_nmt_repo(subword_nmt_repo: Path) Path

gets the path to the root folder of subword-nmt repo

i6_core.util.get_val(var: Any) Any
i6_core.util.instanciate_delayed(o: Any) Any

Recursively traverses a structure and calls .get() on all existing Delayed Operations, especially Variables in the structure

Parameters:

o – nested structure that may contain DelayedBase objects

Returns:

i6_core.util.num_cart_labels(path: Union[str, Path]) int
i6_core.util.partition_into_tree(l: List, m: int) List[List]

Transforms the list l into a nested list where each sub-list has at most length m + 1

i6_core.util.reduce_tree(func, tree)
i6_core.util.remove_suffix(string: str, suffix: str) str
i6_core.util.uopen(path: Union[str, Path], *args, **kwargs) Union[open, open]
i6_core.util.update_nested_dict(dict1: Dict[str, Any], dict2: Dict[str, Any])

updates dict1 with all the items from dict2; both dict1 and dict2 can be nested dicts

i6_core.util.write_paths_to_file(file: Union[str, Path], paths: List[Union[str, Path]])
i6_core.util.write_xml(filename: Union[Path, str], element_tree: Union[xml.etree.ElementTree.ElementTree, xml.etree.ElementTree.Element], prettify: bool = True)

writes element tree to xml file

Parameters:
  • filename – name of desired output file

  • element_tree – element tree which should be written to file

  • prettify – prettify the xml. Warning: be careful with this option if you care about whitespace in the xml.
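
A minimal usage sketch with the standard-library ElementTree; the file and element names are arbitrary examples.

    from xml.etree import ElementTree as ET
    from i6_core.util import write_xml

    root = ET.Element("corpus", name="dev")
    ET.SubElement(root, "recording", name="rec1", audio="rec1.wav")
    write_xml("corpus.xml", ET.ElementTree(root))  # a plain Element is also accepted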

i6_core.util.zmove(src: Union[str, Path], target: Union[str, Path])

i6_core.vtln.features

i6_core.vtln.features.VTLNFeaturesJob(crp, feature_flow, map_file, extra_warp_args=None, extra_config=None, extra_post_config=None)

i6_core.vtln.flow

i6_core.vtln.flow.add_static_warping_to_filterbank_flow(feature_net, alpha_name='warping-alpha', omega_name='warping-omega', node_name='filterbank')
i6_core.vtln.flow.label_features_with_map_flow(feature_net, map_file, map_key='$(id)', default_output=1.0)
i6_core.vtln.flow.recognized_warping_factor_flow(feature_net, alphas_file, mixtures, filterbank_node='filterbank', amplitude_spectrum_node='amplitude-spectrum', omega=0.875)
i6_core.vtln.flow.warp_filterbank_with_map_flow(feature_net, map_file, map_key='$(id)', default_output=1.0, omega=0.875, node_name='filterbank')

i6_core.vtln.train
